SUMMARY


The  present  study  deals  with  problems  of  estimation, 
specifically  the  kind  performed  by  military  officers  under 
conditions  of  uncertainty  and  stress.  Two  types  of  estimation 
were examined: estimation of factual quantities, and revision
of  probability  in  base-rate  type  problems.  Two  corrective 
procedures  were  tested.  The  first  procedure  was  Algorithmic 
Decomposition,  in  which  the  target  estimate  is  divided  into 
simple  and  known  sub-estimates.  These  are  combined,  according 
to  a  rule,  to  yield  the  target  estimate.  The  second  procedure 
was  Tutorial,  and  involved  creating  mental  images  and 
explaining  the  normative  solution  of  base-rate  problems.  The 
Algorithmic  Decomposition  was  tested  in  Experiments  I  and  II. 
Subjects  had  to  estimate  unknown  quantities  under  both  normal 
and  time  stress  conditions.  In  Experiment  I  an  algorithm  was 
provided  as  an  aid,  while  in  Experiment  II,  the  subjects  were 
trained  to  compose  their  own  algorithmic  aid.  The  results 
showed  that,  although  subjects  were  able  to  compose  algorithms, 
this  aid  was  not  optimal,  and  failed  to  improve  estimation. 

The training method for probability assessment in base-rate
problems was tested in Experiments III and IV. In Experiment
III two aids were tested: algorithm and tutorial. The results
showed that both aids were effective. However, only the
tutorial resulted in generalization. In Experiment IV the
Training by Mental Image (TbMI) method was tested. The results
showed that the TbMI method was efficient in training and
improving performance, relative to the control group.
Generalization  was  observed  for  trained  groups,  under  all 
conditions. It was concluded that the algorithmic
decomposition approach was inadequate for a non-academic
military population, since it imposed high mental load and
diverted attention to the creation of the algorithm. It might
have been effective, however, had it been adjusted to the
differential cognitive styles of the user populations. Since
the incorporation of mental images into the Tutorial aid
contributed to better understanding and improved performance,
the possibility of applying this method to training for the
first type of estimation should be considered. It is
recommended that the use of mental images be further developed
and expanded for use in a Computer Aided Instruction (CAI) plan.


SUMMARY 


GENERAL INTRODUCTION

PART A: Algorithmic decomposition as an aid for estimations
    Introduction
    Experiment I
    Experiment II
    Discussion of part A

PART B: Training for overcoming the Base-Rate Fallacy
    Introduction
    Experiment III
    Experiment IV
    Discussion of part B

GENERAL CONCLUSIONS

REFERENCES

APPENDIX A: Estimation problems
APPENDIX B: Subjective mental load questionnaire
APPENDIX C: Training in composing algorithms
APPENDIX D: Base-rate problems for Experiment III
APPENDIX E: Tutorial and Algorithm (Light Bulb version)
APPENDIX F: Base-rate problems for Experiment IV
APPENDIX G: Training by mental image (TbMI)


Modern military decision-making is based, in many cases, on
probability assessments and estimation of unknown quantities.
These have been extensively studied by philosophers,
mathematicians and statisticians as well as psychologists.

Modern cognitive psychology has revealed many cognitive
biases and limitations which endanger the rationality and
optimality of estimation, assessment and decision making (e.g.,
Koriat, Lichtenstein & Fischhoff, 1980; Kahneman, Slovic &
Tversky, 1982; Zakay & Wooler, 1984). Tversky & Kahneman
(1972a, 1973, 1974) showed that, contrary to the accepted
assumption that people perform such mental operations in an
optimal and rational manner, the principles and logic
underlying such judgments are much simpler than those expected
according to the normative models. These principles are
heuristic rules that lead to judgments that may differ
essentially and consistently from those derived by normative
principles.

For example, Kahneman & Tversky (1973) showed that subjects
use the availability heuristic to assess the probability of an
event. That is, the probability of the event is assessed by
the ease with which instances or occurrences can be brought to
mind (Kahneman, Slovic & Tversky, 1982). To the extent that
this ease is influenced by the relative frequency of the event,
a probability estimate based on this heuristic would be a
good estimate of the objective probability. This ease,
however, is also influenced by factors not related to the
relative frequency, such as familiarity, emotional salience and
the like. These would cause systematic biases since, for
example, a single dramatic event would be more easily
remembered and therefore would be judged as more probable.

One of the most intensively investigated heuristics is the
representativeness heuristic (Kahneman & Tversky, 1972a;
Tversky & Kahneman, 1982b). This heuristic is used in
performing two types of judgments: What is the probability that
object A belongs to class B? Or, what is the probability that
event A originates from process B? People tend to rely on the
degree to which A is representative of B. When A is highly
representative of B, the probability that A originates from B
(or B generates A) is judged to be high. This
representativeness can be expressed by the degree to which A
resembles B, or by the statistical or causal relation
between the two events.


One implication of the representativeness heuristic is the
Base-Rate Fallacy. When asked to assess the probability of
occurrence of an event in light of previous probability
information, people tend to neglect some of the relevant
information (the base rate), which must be taken into account.
This is called the base-rate fallacy, and it illustrates the
insensitivity of people to the sample domain from which a
certain event was selected.
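
As an illustration of what the normative solution of a base-rate problem requires, the sketch below applies Bayes' rule to a hypothetical problem. The numbers (a 15% base rate, an 80% hit rate and a 20% false-alarm rate) and the function name `posterior` are assumptions chosen for illustration; they are not taken from the problems used in the experiments.

```python
def posterior(base_rate, hit_rate, false_alarm_rate):
    """Return P(hypothesis | evidence) according to Bayes' rule."""
    # Total probability of observing the evidence, whether or not the
    # hypothesis is true.
    p_evidence = hit_rate * base_rate + false_alarm_rate * (1.0 - base_rate)
    return hit_rate * base_rate / p_evidence


# Hypothetical problem: 15% of the vehicles in a sector are hostile (the
# base rate). An observer reports "hostile" for 80% of the hostile vehicles
# and for 20% of the others.
p = posterior(base_rate=0.15, hit_rate=0.80, false_alarm_rate=0.20)
print(round(p, 2))  # 0.41
```

A judgment that neglects the base rate would place the probability near 0.80, the reliability of the report, whereas the normative answer in this hypothetical case is about 0.41.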

In making estimates of unknown quantities, people often
start from an initial value, or starting point, and then make
adjustments to produce the final estimate. Different starting
points yield different estimates, which are biased toward the
initial value, which serves as an anchor (Kahneman & Tversky, 1982).

The  commander  in  the  future  battle  field  will  have  to 
estimate  unknown  quantities  and  assess  the  probabilities  of 
occurrence  of  various  events.  This  will  be  done  under 
conditions  of  uncertainty,  time  stress  and  information  load. 

The stress factor, which is characteristic of the battle field
as well as of many other decision making situations, affects
estimation and probability assessment. Technological aids that
may be available for use in such situations will help the
expert commander to utilize more information and to make better
decisions than the present commander. However, the
tasks of estimating quantities, assessing probabilities and
evaluating the validity and correctness of data presented to
him by these aids will still be of utmost importance, since
the commander will have to evaluate the information presented
to him and decide how to use it properly. Therefore it is
vital to find ways, or aids, for helping an expert commander
perform these tasks optimally. Since research has shown
that strategies effectively applied under normal conditions are
not transferred to stress conditions (Zakay & Wooler, 1984),
such aids will have to be adjusted to stress conditions.

Various such aids have been suggested and tested. One example
is the algorithmic decomposition technique for estimating unknown
quantities (MacGregor, Lichtenstein & Slovic, 1985). As aids
in probability assessment, mainly for solving base-rate
problems, a number of techniques have been suggested, for example,
the Subjective Sensitivity Analysis (SSA) (Fischhoff, Slovic &
Lichtenstein, 1979; Fischhoff & Bar-Hillel, 1984), the Balanced
SSA (BSSA), Isolation Analysis (AI) and Minimal Focusing (FM)
(Fischhoff & Bar-Hillel, 1984), and Structuring Base-Rates
(Lichtenstein & MacGregor, 1985). These aids were shown to
influence people's judgments; however, they did not
contribute to people's understanding or constructively change
the way in which people conceptualized the problems (Fischhoff
& Bar-Hillel, 1984).

The  general  objectives  of  the  present  study  are  as  follows: 

a.  To  examine  the  effectiveness  of  two  aiding  methods,  the 
algorithmic  decomposition  technique  and  Training  by 
Mental Images (TbMI); and

b.  To  validate  the  utility  of  applying  these  aiding  methods 
by  testing  them  on  military  officers,  under  stress 
conditions. 


The study was carried out in two phases. Phase A focused
on the following goals:

a. Testing the effectiveness of the algorithmic
decomposition technique, previously investigated by
MacGregor, Lichtenstein & Slovic (1985), on an Israeli
military population under time stress conditions;

b. Developing and testing a training method for the
creation of algorithms by the user himself, and
testing its effectiveness in performing general
estimation tasks under time stress conditions.

Phase B focused on the following goals:

a. Testing the effectiveness of the general training
methods for solving base-rate problems, developed by
Lichtenstein & MacGregor (198x), on an Israeli population.

b. Improving this aiding method by introducing mental
imagery and testing its effectiveness on a military
population, using problems of military content and under
time stress conditions.

This report is divided into three parts. In part A the
experiments involving estimation of unknown quantities are
described and discussed. Part B describes and discusses the
experiments involving base-rate problem solving. Part C is a
general discussion of the results obtained in the experiments
described in parts A and B of this report, and general
conclusions.


ALGORITHMIC DECOMPOSITION AS AN AID FOR ESTIMATION

Introduction 


In many decision making situations, decisions are based on a
set of data concerning different aspects of the situation at
hand. These aspects usually involve the values of certain
quantities. Some of these values are readily available and
easily obtained by reference to various information sources.
In many cases, however, the necessary data are unavailable and
therefore have to be estimated. This is especially true for
battle-field situations, when the availability of information
is restricted.

If reliable decisions are to be made, then the estimates
must be made as accurately as possible. In addition, in many
decision making situations the time factor is very critical,
and thus the estimates must also be made as quickly as
possible.

When considering the cognitive strategies people adopt in
making estimates, it is realized that such strategies are
usually ineffective and lead to inadequate estimates. For
example, one approach is to consider one's knowledge of the
quantity being estimated, and intuitively guess an estimate
that seems reasonable in light of whatever knowledge comes to
mind (MacGregor, Lichtenstein & Slovic, 1985). Another
approach is to start from an initial value, and then adjust it
to yield the final estimate (Kahneman, Slovic & Tversky, 1982).
Estimates performed according to such approaches are highly
influenced by irrelevant and biasing factors. For example, the
initial values may become anchors, so that the estimates are
biased toward them (Tversky & Kahneman, 1974). In addition,
recency, salience and emotional factors influence the ease with
which information is retrieved from memory. This may affect
the adjustments of the initial values (Nisbett & Ross, 1980).
Since the resulting estimates are determined by subjective
factors, they do not represent the state of the world.
Decisions based on such intuitive estimates may be erroneous.

The inefficiency of the intuitive strategies, and the
frequent demand for accurate estimates which have to be made
as quickly as possible, call for the development and application
of aids specifying alternative approaches to making estimates.

The algorithmic decomposition is one such approach. It
involves the division of an estimation question into a series of
sub-questions, the answers to which are more accurate, more easily
obtained, and more likely to draw on knowledge that is
available. The answers to the sub-questions can then be
combined, according to a rule, to yield the answer to the
original estimation question. The resulting estimate is expected
to be more accurate than a direct estimate. The approach of analysis
and decomposition is based on the concept of structuring
information in accordance with knowledge organization in human
memory, in order to obtain and utilize information from various
external sources, as well as from retrieval from memory. It is
assumed that an intuitive holistic strategy of estimation,
which incorporates less knowledge, creates a "vacuum" into
which the heuristics are introduced. By using the algorithmic
decomposition approach, this "vacuum" can be filled with
knowledge, and the biases reduced.

The utility of the algorithmic decomposition approach was
tested by MacGregor, Lichtenstein & Slovic (1985). Their
subjects had to estimate the answers to various questions
concerning unknown quantities. This was done under five aiding
conditions, constituting different structuring levels, as
follows:

a. Full Algorithm. In this condition, each question was
decomposed into a complete algorithm. Subjects had to
estimate the answers to the sub-questions and then
combine the sub-estimates according to the arithmetic
rules that were provided.

b. Partial Algorithm. This condition was similar to the
full algorithm; however, the arithmetic rules were not
provided.

c. List & Estimate. In this condition, subjects first
listed components or factors that they believed were
relevant to estimating the target quantity; they then
estimated each of the components they had listed; after
completing these tasks, the subjects made the target
estimates.

d. List. This condition was similar to list & estimate;
however, the subjects were not requested to make
estimates of the factors they had listed.

e. Unaided. In this condition, no aid was provided to
subjects in making the target estimates.

The results of this experiment showed that accuracy and
consistency improved as the structuring level increased.
That is, the partial algorithm and full algorithm conditions
produced more accurate and consistent estimates than the list &
estimate, list and unaided conditions. Similarly, the list
condition led to more accurate estimates than the unaided
condition; however, the list & estimate condition did not help
the subjects focus more directly on the magnitude of the value
they had to estimate.

In their discussion, MacGregor, Lichtenstein & Slovic
(1985) pointed out that, although their results showed that
people can perform estimates according to specified algorithms,
in evaluating the quality of such an approach one has to take
into consideration the representation of the estimation
questions. Estimation questions can be represented in several
ways, each of which may influence the subject's performance in a
different manner. A representation which is effective for some
people may be biasing and misleading for others. This means
that to be a useful aid, an algorithm must be compatible with
the specific needs of each user. In addition, in everyday and,
especially, battle-field decision making situations, algorithms
are not provided; instead they have to be produced by the
decision maker.

Battle-field decision making is performed under time
stress conditions, and is known to be characterized by unique
cognitive processes. Ben Zur and Breznitz (1981) found that
under high time pressure a filtration mechanism was activated,
that is, "Information which was perceived by the individual as
most important was processed first, and then processing was
continued until time was up" (p. 102). Research has shown that
training is not transferred to stress conditions. In addition,
Zakay (1985) found that time pressure would lead to more
frequent use of non-compensatory strategies.

In view of the above, the algorithmic decomposition may
become an effective aid if taught and used as a mental model
of estimation. This would enable people to compose their own
individual algorithms, which would be compatible with their own
mental processes and needs. Since the utility of this
approach may depend on the user, and can be affected by
stress conditions (similarly to other cognitive aids), it is
important to test its effectiveness on various populations,
mainly military populations, under stress conditions.

The purpose of the present study is to evaluate the algorithmic
decomposition as a method for estimating quantities by
military officers under normal and time stress conditions. An
additional goal is to develop and validate a method, based on
the algorithmic decomposition approach, for training military
officers in composing and using their own algorithms when
estimating quantities under normal and time stress conditions.


Experiment I


Experiment I was designed to test the effectiveness of
algorithmic decomposition, as an aid in estimating unknown
quantities, on a military population. Experiment I is a partial
replication of the study carried out by MacGregor,
Lichtenstein & Slovic (1985). However, of the five
structuring levels used in the original experiment, only the
full algorithm, which was found to be the most effective, and
the unaided control levels were used in the present experiment.

Method


Subjects. Seventy-one IDF junior officers participated in
Experiment I. All subjects had a secondary education.

Estimation tasks. All the subjects were asked to perform 8
different estimation tasks. The questions were of the type
"How much food (in kg) does one person eat during his entire
life". The correct answers to the questions varied in
magnitude from 27,040,000,000 to 100. The estimation questions
and the correct answers are shown in Appendix A. The questions
were based on quantities contained in statistical almanacs. A
pilot study revealed that the subjects were unlikely to know
the exact quantities to which the estimation questions related;
however, they did have some relevant knowledge on which
the sub-estimates could be based.

Experimental design. Two independent variables were
manipulated in this experiment: aid type and time restriction.

Aid type: In the aided condition a full algorithm was
provided as an aid in answering each estimation question.
Under this condition subjects had to estimate each component of
the algorithm and then combine the estimates according to the
arithmetic rule provided (this condition is denoted AL). For
example:


a. What is the population of Israel?

b. What proportion of the population smokes?

c. What is the number of smokers in Israel?
[multiply (a) by (b)].

d. How many cigarettes does the average smoker consume per
day?

e. How many cigarettes are consumed in Israel per day?
[multiply the answer to (c) by (d)].

f. How many days are there in a year?

g. How many cigarettes are consumed in Israel in a year?
[multiply (e) by (f)].
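
The algorithm above amounts to a chain of multiplications over the sub-estimates. A minimal sketch follows; the function name `cigarettes_per_year` is hypothetical, and the sub-estimate values are placeholders invented for illustration (they are neither subjects' answers nor the correct figures from Appendix A).

```python
def cigarettes_per_year(population, proportion_smokers,
                        cigarettes_per_smoker_per_day, days_per_year=365):
    """Combine the sub-estimates (a) to (f) into the target estimate (g)."""
    smokers = population * proportion_smokers            # (c) = (a) x (b)
    per_day = smokers * cigarettes_per_smoker_per_day    # (e) = (c) x (d)
    return per_day * days_per_year                       # (g) = (e) x (f)


# Placeholder sub-estimates, for illustration only.
estimate = cigarettes_per_year(population=4_000_000,
                               proportion_smokers=0.25,
                               cigarettes_per_smoker_per_day=20)
print(f"{estimate:,.0f}")  # 7,300,000,000
```

Because the combination rule is multiplicative, a proportional error in any single sub-estimate carries through to the target estimate in the same proportion.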

In the unaided condition, which served as the control
condition, no aid was provided. Under this condition each
target quantity used in the study had to be estimated directly
(this condition is denoted C).

Time restriction: There were two time restriction
conditions. In the unlimited time condition, subjects had to
make the estimates without any time restriction (this condition
is denoted NTL). In the limited time condition subjects had to
answer each estimation problem within 2 minutes (this condition
is denoted TL). The 2-minute limit was determined by running
the unlimited time AL group first, and measuring the time
required to make each estimation. The minimum time was
selected as the time limit for the time restricted groups.

The experimental design with two independent variables,
which make up 4 different groups, is presented in Table 1.

Table 1: The Experimental Design of Experiments I and II

TIME LIMIT         AIDED (AL)    UNAIDED (C)
LIMITED (TL)       AL, TL        C, TL
UNLIMITED (NTL)    AL, NTL       C, NTL

The dependent variables. Two dependent variables were
measured in this experiment: subjects' estimates and subjective
mental load.

Subjects' estimates. Subjects' responses for each
estimation task were recorded. The measure for accuracy of
estimation was computed as follows:


$$V = \frac{|X - Y|}{Y}$$


where V is the measure of accuracy, X is the subject's
estimate and Y is the correct answer. This transformation
was selected to enable direct comparisons between different
estimates, regardless of their actual magnitude. Absolute
values were used, since the direction of the error is
irrelevant.
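
The accuracy measure can be sketched as follows, assuming that it is the absolute error divided by the correct answer (a relative error). The function name `accuracy_measure` and the sample numbers are hypothetical.

```python
def accuracy_measure(estimate, correct):
    """Absolute error relative to the correct answer (0 is a perfect estimate)."""
    if correct == 0:
        raise ValueError("the correct answer must be non-zero")
    return abs(estimate - correct) / abs(correct)


# Two hypothetical estimates that are each off by half of the true value
# receive the same score, whatever the magnitude of the target quantity.
print(accuracy_measure(150, 100))      # 0.5
print(accuracy_measure(1.5e9, 1.0e9))  # 0.5
```

Under this assumption an underestimate can score at most 1, while an overestimate has no upper bound, so a few large overestimates can dominate a group mean.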

Subjective mental load measures. Subjective ratings were
used to measure the subjects' mental load in the various
experimental conditions. Subjects rated the extent of
difficulty, mental effort, fatigue, frustration, and time limit
they had encountered while working on the problems. The ratings
were made on a seven-point scale ranging from "very easy",
"little effort", "not tiring at all", "not frustrating at all"
and "enough time" (1) to "very difficult", "a lot of effort",
"very tiring", "very frustrating" and "not enough time" (7).
These were later used for obtaining a measure of subjective
mental load. The scales (shown in Appendix B) were adopted
from Vidulich & Tsang (1985).


Procedure. The subjects were randomly assigned to four
groups of 15 to 20 officers. Each group of respondents was run
separately in small classrooms at an army base. The
instructions preceding each session indicated that the purpose
of the study was to examine the ways in which military decision
makers solve various problems. It was emphasized that
participation in the experiment was anonymous, and that
subjects' performance would not affect their career.

The experimental sessions were the same for all the groups.
Subjects were given a brief introduction to the study and then
filled in a standard form containing details such as age, sex,
months of service, current job, command experience, education
and the like. The subjects then received the eight estimation
problems, and answered them under the experimental
conditions to which they were assigned. During the sessions,
pocket calculators were available for use, in order to avoid
arithmetic errors which might affect the estimates. Subjects
in the TL groups were not told at the beginning of the
experimental session that they would be time restricted later.
In this condition all the subjects in a TL group were asked to
make each estimate within a fixed time interval, whose starting
and ending points were signaled by the experimenter. After
completing the above, the respondents completed the subjective
mental load questionnaire.

Results

Accuracy  of  Estimation 

The  means  and  standard  deviations  (s.d.)  of  the  accuracy 
measures  are  presented  in  Table  2  for  each  estimation  question. 

Table 2: Means and Standard Deviations of Accuracy Measures
(Upper no. = mean; lower no. = s.d.)

[Table body illegible in the source. Columns: questions V1 to V8, each under the TL and NTL conditions; rows: AIDED and UNAIDED.]

The analyses of variance of the accuracy measures, for
each estimation question, are shown in Table 3. No interaction
effects were obtained.

Table 3: Summary Table of ANOVA of Accuracy Measures for
each Estimation Question (numbers indicate F(1,67) values)

VARIABLE      V1      V2       V3         V4     V5      V6       V7      V8
AID TYPE      1.140   *4.255   **11.121   .164   3.274   *3.991   .217    2.113
TIME LIMIT    .468    2.626    .120       .816   .220    1.611    1.346   .027

* p<.05   ** p<.01


Table 3 shows that the mean accuracy measures are higher
for the AL groups than for the C groups for the majority of
questions (Q1, Q2, Q4, Q5, Q6, and Q8). The opposite is
observed only for questions Q3 and Q7. The mean accuracy
measures are higher for the NTL groups than for the TL groups
for questions Q1, Q2, Q3, and Q6. The opposite is observed for
questions Q4, Q5, Q7, and Q8.

Subjective  mental  load 

Difficulty.  The  difficulty  mean  ratings  and  s.d.  are 
presented  in  Table  4. 

Table 4: Means and Standard Deviations of Difficulty
Ratings (Upper No. = Mean; Lower No. = S.d.)

TIME         AIDED    UNAIDED    TOTAL
UNLIMITED    3.14     4.75       4.09
             1.16     1.65       1.80
LIMITED      3.75     4.07       3.88
             1.66     1.64       1.64
TOTAL        3.48     4.47       3.99
             1.64     1.66       1.17

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Table  4  indicates  that  the  mean  difficulty  rating  across 
all  groups  is  3.99.  The  mean  ratings  are  higher  for  the  C 
groups  (4.47,  s.d. -1.66)  than  for  the  AL  groups  (3.48, 
s.d. -1.64).  The  mean  ratings  are  higher  for  the  NTL  groups 
(4.09,  s.d. -1.80)  than  for  the  TL  groups  (3. 88,  s.d. -1.64). 

An analysis of variance on these data indicated a significant
effect of aid (F(1,63)=5.77, p<.05).

Mental  effort.  The  mental  effort  mean  ratings  and  s.d.  are 
shown  in  Table  5. 

Table 5: Means and Standard Deviations of Mental Effort
Ratings (Upper No. = Mean; Lower No. = S.d.)

TIME \ AID     AIDED    UNAIDED    TOTAL

UNLIMITED      3.29     4.40       3.94
               1.68     1.93       1.87

LIMITED        3.68     4.07       3.85
               1.45     1.44       1.44

TOTAL          3.52     4.26       3.90
               1.54     1.73       1.66

The data in Table 5 indicate that the mean mental effort
rating across all groups is 3.90. The mean ratings are higher
for the C groups (4.26, s.d.=1.73) than for the AL groups
(3.52, s.d.=1.54). The mean ratings are higher for the NTL
groups (3.94, s.d.=1.87) than for the TL groups (3.85,
s.d.=1.44).

An  analysis  of  variance  on  these  data  showed  no  significant 
main  effects. 

Fatigue. The fatigue mean ratings and s.d. are shown in
Table 6. The data in Table 6 show that the mean fatigue rating
across all groups is 3.55. The mean ratings are higher for the
AL groups (4.06, s.d.=1.75) than for the C groups (3.03,
s.d.=1.87). The mean ratings are higher for the NTL groups
(3.84, s.d.=1.86) than for the TL groups (3.27, s.d.=1.84).

Table 6: Means and Standard Deviations of Fatigue
Ratings (Upper No. = Mean; Lower No. = S.d.)

TIME \ AID     AIDED    UNAIDED    TOTAL

UNLIMITED      4.43     3.39       3.84
               3.93     1.93       1.86

LIMITED        3.79     2.57       3.27
               1.81     1.70       1.84

TOTAL          4.06     3.03       3.55
               1.75     1.87       1.86

An analysis of variance on these data indicated a significant
main effect for aid type (F(1,61)=6.32, p<.05).

Frustration.  The  frustration  mean  ratings  and  s.d.  are 
shown  in  Table  7. 

Table 7: Means and Standard Deviations of Frustration
Ratings (Upper No. = Mean; Lower No. = S.d.)

TIME \ AID     AIDED    UNAIDED    TOTAL

UNLIMITED      3.93     4.17       4.06
               2.02     1.87       1.91

LIMITED        4.26     4.00       4.15
               2.05     1.75       1.91

TOTAL          4.12     4.09       4.11
               2.01     1.80       1.90

In Table 7 it can be seen that the mean frustration rating
across all groups is 4.11. The mean ratings are higher for the
AL groups (4.12, s.d.=2.01) than for the C groups (4.09,
s.d.=1.80). The mean ratings are higher for the TL groups
(4.15, s.d.=1.91) than for the NTL groups (4.06, s.d.=1.91).

An analysis of variance on these data failed to reach
significance.

Subjective  time  stress.  The  time  stress  mean  ratings  and 
s.d.  are  shown  in  Table  8. 


Table 8: Means and Standard Deviations of Time Stress
Ratings (Upper No. = Mean; Lower No. = S.d.)

TIME \ AID     AIDED    UNAIDED    TOTAL

UNLIMITED      1.84     1.39       1.60
               1.39      .83       1.09

LIMITED        2.68     3.71       3.12
               2.60     1.94       2.01

TOTAL          2.24     2.41       2.32
               1.82     1.82       1.81

The data in Table 8 indicate that the mean time stress
rating across all groups is 2.32. The mean ratings are higher
for the C groups (2.41, s.d.=1.82) than for the AL groups
(2.24, s.d.=1.82). The mean ratings are higher for the TL
groups (3.12, s.d.=2.01) than for the NTL groups (1.50,
s.d.=1.09).

An analysis of variance on these data indicated a significant
main effect for time restriction (F(1,61)=17.23, p<.01).

Subjective mental load. The computed values of subjective
mental load are shown in Table 9.

Table 9: Subjective Mental Load Values
(Upper No. = Mean; Lower No. = S.d.)

TIME \ AID     AIDED    UNAIDED    TOTAL

UNLIMITED      3.33     3.62       3.49
               1.30     1.21       1.24

LIMITED        3.63     3.69       3.65
               1.27      .88       1.09

TOTAL          3.60     3.65       3.57
               1.28     1.06       1.17

The data in Table 9 show that the mean subjective mental
load, across all groups, is 3.57. These values are higher for
the C groups (3.65) than for the AL groups (3.50). The values
are higher for the NTL groups (3.65) than for the TL groups
(3.49).

An analysis of variance on these data failed to reach
significance.

The results of Experiment I show that the algorithmic
decomposition  aid,  for  estimation  of  unknown  quantities,  did 
not  lead  to  more  accurate  estimates,  but  rather,  it  caused 
enlargement  of  the  errors.  That  is,  the  difference  between  the 
subjects’  estimates  and  the  correct  answers  is  usually  larger, 
when  performed  with  the  algorithmic  aid,  than  without  any  aid. 

These results are not in line with those found by MacGregor,
Lichtenstein & Slovic (1985). Their results indicated more
accurate estimates when the algorithmic decomposition aid was
provided. The reason for these contradictory findings may lie
in the difference between the populations tested in each
experiment. The subjects participating in MacGregor,
Lichtenstein & Slovic's experiment were university students.

For  members  of  such  a  population,  strategies  such  as  the 
algorithmic  decomposition,  may  be  compatible  with  their  own  way 
of  thinking,  which  was  acquired  as  a  habit,  and  became  more 
intuitive.  On  the  other  hand,  the  population  tested  in 
Experiment  I,  was  IDF  junior  officers,  whose  formal  education 
was  non-academic.  These  subjects  are  likely  to  have  different 
and  unique  patterns  of  thinking,  adapted  to  their  usual  tasks 
and  acquired  while  performing  them.  These  intuitive  patterns 
may  be  quite  different  than  those  imposed  by  the  algorithmic 
decomposition  aid,  thus,  leading  to  inaccurate  estimates. 

Apart from the adequacy or inadequacy of the algorithmic
approach, one should consider the fact that estimation errors
occurring in the sub-estimates influence both the likelihood
and the size of error in the target estimate, relative to
holistic direct estimates. For example, suppose a user
exhibits a tendency to overestimate, and therefore
overestimates each component of the algorithm. The accumulated
effect of this overestimation would lead to target estimates
which are much larger than estimates performed directly.
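The accumulation argument can be illustrated numerically. The sketch below assumes, purely for illustration, that the algorithm multiplies its sub-estimates together and that the estimator inflates every quantity by the same relative bias. Neither assumption is taken from the experiments, and the function names and numbers are hypothetical.

```python
from typing import Sequence


def direct_estimate(true_value: float, bias: float) -> float:
    """Holistic estimate, inflated once by a relative bias (0.2 means +20%)."""
    return true_value * (1.0 + bias)


def decomposed_estimate(true_components: Sequence[float], bias: float) -> float:
    """Product of sub-estimates, each inflated by the same relative bias."""
    estimate = 1.0
    for component in true_components:
        estimate *= component * (1.0 + bias)
    return estimate


# Hypothetical target built from three multiplied components (true value 6000).
components = [10.0, 20.0, 30.0]
true_value = 10.0 * 20.0 * 30.0
bias = 0.2

direct_ratio = direct_estimate(true_value, bias) / true_value
decomposed_ratio = decomposed_estimate(components, bias) / true_value

print(f"direct estimate:     {direct_ratio:.3f} x true value")
print(f"decomposed estimate: {decomposed_ratio:.3f} x true value")
```

Under these assumptions a 20% bias applied once stays at 1.2 times the true value, whereas the same bias applied to each of three multiplied components compounds to about 1.73 times the true value.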


In addition to the problems caused by algorithmic
decomposition as a strategy of estimation, the content of the
algorithms (e.g., the specific sub-estimates, the number of
sub-estimates, estimation of proportions vs. integers, etc.)
may influence performance. However, this aspect may depend
more on the cognitive style of the individual user than on the
characteristics of the population of which he/she is a
member.

This suggests that the aiding approach and its content
have to be adapted to the specific user population. This is
especially important when dealing with a military population,
and the adaptation should take into account the unique
conditions under which that population operates.

The subjective mental load measures obtained here seem to
support these interpretations. Subjects in the unaided groups
reported higher difficulty and mental effort than those in the
aided groups (Tables 4 and 5). Subjects in the unlimited time
groups also reported higher difficulty and mental effort than
those in the time limited groups. This may indicate that
holistic direct estimation causes a high subjective mental load
that can be reduced by aiding. The higher degree of fatigue and
frustration reported by the aided subjects, however, may be a
result of the incompatibility of the specific aid tested.


Experiment II

Experiment I examined the effectiveness of the algorithm
when it is fully provided, as an aid, by the experimenter.

Since in real life an algorithm must be composed by the user, a
method was developed in order to train people in creating their
own algorithms. Experiment II was designed to test the
effectiveness of such a training method on a military
population.

Method 

Subjects. Seventy-one IDF junior officers participated in
Experiment II. All subjects had a secondary education.

Estimation tasks. The estimation questions employed in
Experiment II were the same as those used in Experiment I.


Experimental design and procedure. The experimental design
and procedure in Experiment II were similar to those of
Experiment I. However, no algorithms were provided. Instead,
before answering the questions, the aided subjects had to read
a detailed tutorial describing the algorithmic decomposition
approach and why it should be used. The tutorial, shown in
Appendix C, also explained how to create an algorithm, and
contained two training estimation questions. No training was
provided for the control groups.

Results

Accuracy  of  Estimation 

The mean and s.d. of the accuracy measures are presented in
Table 10 for each estimation question.


Table 10: Means and Standard Deviations of Accuracy
Measures (Upper no. = mean; Lower no. = s.d.)


|  QUESTION 

.  -VT  V 

L  « 

1  V3  *  I 

1  •  V4.  1 

1  -•  V5  1 

1  •'  • V6  1 

f - 

7  1 

1  V8  i  1 

isl 

TL 

NTL 

TL'  • 

• 

NTL 

TL 

a 

mm 

m 

NTL 

.  TL.  ' 

NTL 

n 

NTL 

TL 

sa 

■ 

TL 

19 

AIDED 

1.8S 

.62 

1.74 

1.17 

1.00 

.89 

5.53 

1.07 

8.33 

2.48 

3.12 

3.31 

.78 

.73 

3.94 

1.43 

2.63 

.52 

2.06 

1.70 

.16 

.25 

20.58 

1.05 

16.87 

3.00 

10.53 

9.30 

.33 

.25 

13.30 

3.05 

UNAIDED 

.69 

.73 

1.38 

1.01 

.97 

.99 

4.45 

4.92 

3.92 

6.12 

1.14 

BjjjPPl 

.75 

.77 

.46 

.73 

.42 

.69 

1.97 

1.79 

.19 

.03 

.41 

10.9W 

3.30 

11.71 

.57 

.16 

.28 

.31 

.52 

The main effects, obtained in analyses of variance, for
each estimation question, are shown in Table 11. No main or
interaction effects were obtained.

Table 11: Summary Table of ANOVA of Accuracy Measures
for each Estimation Question (Nos. indicate
F(1,67) values)

QUESTION       V1      V2      V3      V4      V5      V6      V7

AID TYPE     2.366   JS 44   1.506    .027    .610    .155    .006

TIME LIMIT   3.366   1.166   1.932    .000   1.204    .718    .045
Table 11 shows that the mean accuracy measures are higher
for the aided groups than for the unaided groups for questions
V1, V2, V4, V5, and V8. The opposite is observed for
questions V3 and V6. The mean accuracy measures are the same
for the aided and unaided groups for question V7. The mean
accuracy measures are higher for the unlimited time groups than
for the limited time groups for questions V4 and V6. The
opposite is observed for questions V2, V3, V5, V7, and V8.

Subjective  mental  load 

Difficulty.  The  difficulty  mean  ratings  are  presented  in 
Table  12. 

Table 12: Means and Standard Deviations of Difficulty
Ratings (Upper No. = Mean; Lower No. = S.d.)

TIME \ AID     AIDED    UNAIDED    TOTAL

UNLIMITED      3.81     4.40       4.14
               1.28     1.96       1.69

LIMITED        3.39     5.15       4.50
               1.28     1.68       1.16

TOTAL          3.87     4.70       4.32
               1.26     1.86       1.64

The data in Table 12 indicate that the mean rating, across
all groups, is 4.32. The mean ratings are higher for the
unaided groups (4.70, s.d.=1.86) than for the aided groups
(3.87, s.d.=1.26). The mean ratings are higher for the limited
time groups (4.50, s.d.=1.16) than for the unlimited time
groups (4.14, s.d.=1.69).

An analysis of variance on these data indicated a significant
effect of aid (F(1,60)=4.63, p<.05).

Mental effort. The mental effort mean ratings are shown in
Table 13. The data in Table 13 show that the mean rating,
across all groups, is 4.30. The mean ratings are higher for
the aided groups (4.29, s.d.=1.40) than for the unaided groups
(4.27, s.d.=1.91). The mean ratings are higher for the limited
time groups (4.46, s.d.=1.84) than for the unlimited time
groups (4.14, s.d.=1.55).

Table 13: Means and Standard Deviations of Mental Effort
Ratings (Upper No. = Mean; Lower No. = S.d.)

TIME \ AID     AIDED    UNAIDED    TOTAL

UNLIMITED      4.38     3.05       4.14
               1.36     1.70       1.55

LIMITED        4.20     4.77       4.46
               1.47     2.21       1.84

TOTAL          4.20     4.27       4.30
               1.40     1.91       1.67

An  analysis  of  variance  on  these  data  showed  no  significant 
main  effects. 

Fatigue.  The  fatigue  mean  ratings  are  shown  in  Table  14. 

Table 14: Means and Standard Deviations of Fatigue
Ratings (Upper No. = Mean; Lower No. = S.d.)

TIME \ AID     AIDED    UNAIDED    TOTAL

UNLIMITED      3.53     3.26       3.39
               1.36     1.51       1.44

LIMITED        3.27     2.86       3.07
               1.58     1.29       1.44

TOTAL          3.40     3.06       3.24
               1.46     1.41       1.44

In Table 14 it can be seen that the mean rating, across all
groups, is 3.24. The mean ratings are higher for the aided
groups (3.40, s.d.=1.46) than for the unaided groups (3.06,
s.d.=1.41). The mean ratings are higher for the unlimited time
groups (3.39, s.d.=1.44) than for the limited time groups
(3.07, s.d.=1.44).

An analysis of variance on these data showed no significant
main effects.


Frustration. The frustration mean ratings are shown in
Table 15.

Table 15: Means and Standard Deviations of Frustration
Ratings (Upper No. = Mean; Lower No. = S.d.)

TIME \ AID     AIDED    UNAIDED    TOTAL

UNLIMITED      3.27     4.78       4.09
               1.75     2.24       2.07

LIMITED        3.53     4.21       3.86
               2.00     2.03       2.03

TOTAL          3.40     4.53       3.98
               1.84     2.15       2.04

The data in Table 15 indicate that the mean rating, across
all groups, is 3.98. The mean ratings are higher for the
unaided groups (4.53, s.d.=2.15) than for the aided groups
(3.40, s.d.=1.84). The mean ratings are higher for the
unlimited time groups (4.09, s.d.=2.07) than for the limited
time groups (3.86, s.d.=2.03).

An analysis of variance on these data indicated a significant
main effect of aid (F(1,58)=4.88, p<.05).

Subjective time stress. The mean time stress ratings are
presented in Table 16. The data in Table 16 indicate that the
mean rating, across all groups, is 2.66. The mean ratings are
higher for the aided groups (2.90, s.d.=1.99) than for the
unaided groups (2.44, s.d.=1.81). The mean ratings are higher
for the limited time groups (3.31, s.d.=1.91) than for the
unlimited time groups (2.09, s.d.=1.72).

Table 16: Means and Standard Deviations of Time Stress
Ratings (Upper No. = Mean; Lower No. = S.d.)

TIME \ AID     AIDED    UNAIDED    TOTAL

UNLIMITED      1.07     2.28       2.09
               1.55     1.87       1.72

LIMITED        3.93     2.84       3.31
               1.87     1.78       1.91

TOTAL          2.90     2.44       2.66
               1.99     1.81       1.90


An analysis of variance on these data indicated a significant
main effect for time restriction (F(1,58)=6.94, p<.05).


Subjective mental load. The computed measures for
subjective mental load are shown in Table 17.

Table 17: Subjective Mental Load Values
(Upper No. = Mean; Lower No. = S.d.)

TIME \ AID     AIDED    UNAIDED    TOTAL

UNLIMITED      3.37     3.72       3.56
               1.04     1.40       1.25

LIMITED        3.77     3.61       3.79
               1.17     1.01       1.08

TOTAL          3.57     3.76       3.67
               1.11     1.24       1.18


The data in Table 17 show that the subjective mental load
measure, across all groups, is 3.67. These measures are higher
for the unaided groups (3.76) than for the aided groups
(3.57). The measures are higher for the limited time
groups (3.79) than for the unlimited time groups (3.56).


An  analysis  of  variance  on  these  data  failed  to  reach 
significance. 


Content analysis of the questionnaires used in Experiment
II indicated that the subjects did learn to compose
algorithms and were able to apply them. However, the results
of Experiment II showed that the training method for building
algorithms was not effective. It did not lead to more accurate
estimates, but caused enlargement of the estimation errors.
That is, the difference between the subjects' estimates and the
correct answers was usually larger when performed with the
algorithmic aid than without any aid. These results are in
line with those found in Experiment I. Therefore, it is likely
that the reason for the subjects' poor performance is the
inadequacy of the algorithmic decomposition approach. As
argued for Experiment I, this may be due to the
incompatibility of this approach with the characteristics of
the military population used in these experiments, and to the
accumulated effects of biased sub-estimation.

When  considering  the  subjective  mental  load  measures,  the 
results  showed  that  the  untrained  subjects  found  their  task  to 
be  more  difficult  and  frustrating  than  the  trained  subjects. 
This  indicates  the  need  for  aiding.  The  higher  degree  of 
mental  effort  and  fatigue,  reported  by  the  trained  subjects, 
however,  suggests  that  the  actual  creation  of  an  algorithm  may 
be  highly  demanding,  and  may  divert  the  users'  attention  from 
the  estimation  itself. 

Both Experiments I and II suggest that in developing an aid
or training method based on the algorithmic approach, one
should take into account the unique characteristics of the
target population, and adjust the aid accordingly. Such an aid
would be more compatible with the thinking patterns and
cognitive style of the target population. Only after the aid
and training method are adapted to the population, its
cognitive style and thinking patterns, can an efficient aiding
method be introduced.
EMI_£

TRAINING FOR OVERCOMING THE BASE-RATE FALLACY

INTRODUCTION

In problem solving and decision making, people tend to
ignore some of the available information even though it is
relevant. Base-rate problems are inference problems containing
base-rate information about a certain phenomenon (prior odds),
information about the degree of accuracy of a given diagnostic
device or method (usually referred to as the diagnostic or
specific information), and a question concerning the
probability of a particular event. The base-rate fallacy is the
tendency to neglect the base-rate information when attempting
to solve such problems.

The  most  common  example  of  the  base-rate  problem  is  the  Cab 
Driver  problem  (Tversky  &  Kahneman,  1972a): 

A  cab  was  involved  in  a  hit  and  run  accident  at  night.  Two 
cab  companies,  the  Green  and  the  Blue,  operate  in  the  city. 
You  are  given  the  following  data: 

a. 85% of the cabs in the city are Green and 15% are Blue.

b. A witness identified the cab as Blue.

The court tested the reliability of the witness under the
same circumstances that existed on the night of the
accident and concluded that the witness correctly
identified each one of the two colors 80% of the time and
failed 20% of the time.

What  is  the  probability  that  the  cab  involved  in  the 
accident  was  Blue  rather  than  Green? 

Research has shown that when presented with this problem or
variations of it, people tend to consider only the diagnostic
information (e.g., Kahneman & Tversky, 1973; Lyon & Slovic,
1976). Therefore the answer usually given to the above
problem is 80%.

The normative statistical model applicable in solving such
problems is Bayes' rule or theorem. It is useful in computing
the probabilities of various hypotheses which may have resulted
in a given event (Beyth-Marom & Fischhoff, 1983). Bayes' rule
maintains that in re-evaluating the state of the world, one
should consider both the new evidence and previous knowledge.

This rule is formulated as follows:

$$P(E_1 \mid A) = \frac{P(E_1)\,P(A \mid E_1)}{P(E_1)\,P(A \mid E_1) + P(E_2)\,P(A \mid E_2)}$$

where E_1 and E_2 are the possible states of the world, P(E_1)
and P(E_2) are their prior probabilities, and A is the new
evidence.
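As a worked illustration, the rule can be applied to the Cab problem quoted earlier. The sketch below is only an arithmetic check, not part of the original study; the function and variable names are hypothetical, and it assumes the two states (Blue, Green) are exhaustive and mutually exclusive.

```python
def posterior(prior_e1: float, p_a_given_e1: float, p_a_given_e2: float) -> float:
    """P(E1 | A) for two exhaustive, mutually exclusive states E1 and E2."""
    prior_e2 = 1.0 - prior_e1
    numerator = prior_e1 * p_a_given_e1
    return numerator / (numerator + prior_e2 * p_a_given_e2)


# Cab problem: E1 = the cab is Blue, E2 = the cab is Green,
# A = the witness says "Blue".
p_blue = posterior(
    prior_e1=0.15,      # base rate of Blue cabs
    p_a_given_e1=0.80,  # witness says "Blue" when the cab is Blue
    p_a_given_e2=0.20,  # witness says "Blue" when the cab is Green
)
print(round(p_blue, 2))  # 0.41
```

The normative answer, about 41%, is roughly half of the 80% that respondents typically give when they rely on the diagnostic information alone.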

Extensive research was done in order to find the conditions
under which base-rate information is neglected or used. Some
researchers argued that the biased responses of their subjects
stemmed from the content of the problem story. Hammerton
(1973) asked subjects to solve base-rate problems in which the
diagnostic information referred to the degree of accuracy of a
medical diagnostic test. The results showed that subjects'
judgments were dominated by the diagnostic information.
Hammerton argued that the reason for this result was that the
subjects had a "rigid prior expectation" that such tests are
infallible, and therefore neglected the base-rate information.
When the problem story was changed to prevent this bias,
subjects' responses shifted from the diagnostic value, yet were
still higher than the Bayesian answer.

Lyon & Slovic (1976) investigated the effect of various
aspects connected with the problem story on the degree of the
bias. They investigated aspects of content, extreme base-rate
values, presentation order of the information and response
format. The base-rate fallacy was observed under all of these
conditions.

There are many practical contexts in which people use
diagnostic tools in order to decide which of two or more
hypotheses is correct. Such contexts are, for example,
medicine, law, and especially intelligence and other military
domains. In these cases, ignoring base rates may have
undesirable and severe consequences. In light of the above
empirical evidence, it is vital to develop aids in order to
direct people in probability assessment of this type.


Fischhoff, Slovic & Lichtenstein (1979) developed the
Subjective Sensitivity Analysis (SSA) procedure. Using
base-rate problems that had previously elicited neglect of the
base rate, the SSA procedure directed the subjects to first
consider how they would perform the same judgment with various
base-rate values, and only then respond. The results showed
that this procedure affected subjects' judgments, in the sense
that they were closer to the normative (Bayesian) answer than
is usually found. However, this improvement did not
generalize. That is, after solving base-rate problems using
the SSA procedure, subjects had to solve other, similarly
structured base-rate problems without SSA. Again, the base
rates were ignored.
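To make concrete why considering several base-rate values matters, the sketch below computes the normative answer to the Cab problem for a range of base rates while the witness accuracy is held at 80%. It illustrates the normative sensitivity only, not the SSA procedure as administered to subjects; the function name and the base rates other than 0.15 are hypothetical.

```python
def posterior_blue(base_rate: float, accuracy: float) -> float:
    """P(Blue | witness says Blue) for a witness equally accurate on both colors."""
    hit = base_rate * accuracy
    false_alarm = (1.0 - base_rate) * (1.0 - accuracy)
    return hit / (hit + false_alarm)


# Hypothetical base rates to sweep; only 0.15 appears in the Cab problem.
base_rates = [0.05, 0.15, 0.50, 0.85]
posteriors = [posterior_blue(rate, accuracy=0.80) for rate in base_rates]

for rate, value in zip(base_rates, posteriors):
    print(f"base rate {rate:.2f} -> P(Blue | 'Blue') = {value:.2f}")
```

The posterior equals the witness accuracy only when the two colors are equally common, so a respondent who always answers 80% is implicitly assuming a base rate of 50%.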

Fischhoff & Bar-Hillel (1984) further investigated the
effect of the SSA procedure. They found that although SSA
consistently increased usage of base-rate information, it did
so as a mechanical procedure, rather than contributing
qualitatively to subjects' comprehension.

Fischhoff & Bar-Hillel also tested three alternative
techniques of enhancing a variable's salience (focusing
techniques): Isolation Analysis (IA), which encourages
subjects to consider each item of information in turn, judging
how they would respond if it were the only information
available; Minimal Focusing (MF), which instructs subjects
explicitly to consider both items of information; and Balanced
SSA (BSSA), which applies SSA separately to both items of
information. These techniques were effective in changing
subjects' performance, in the sense that subjects did not
ignore the base-rate information. However, this change cannot
be attributed to better understanding, since the base-rate
information was also considered when subjects responded to
other problems that did not require the base-rate information.

Fischhoff & Bar-Hillel concluded that "It is not enough to
motivate subjects or clarify instructions or give problems with
a familiar content. In order to improve intuitive judgement a
manipulation must constructively change the way in which people
conceptualize a problem, or give them new cognitive skills with
which to examine it. Thus, instead of debiasing procedures,
there may be a need for training programs" (p. 193).

The effectiveness of various structuring aids in solving
base-rate problems was investigated by Lichtenstein &
MacGregor (1985). Their subjects were required to solve
base-rate problems under the following experimental conditions:


a. Control. In this condition, the subjects had to solve
the problems without any aid.

b. List. In this condition, the subjects had to list
factors that they believed were relevant to the answer;
this was done before answering the problems.

c. Algorithm. In this condition, the subjects were given a
full algorithm specifying all the stages of the correct
solution. They were required to extract the information
from the problem, assign it according to the
instructions and do the specified arithmetic.

d. Tutorial. In this condition, the subjects read a
seven-page tutorial, specifying the way of solution and
explaining why this was the correct one.

The results showed that the list condition had no effect.
On the other hand, the algorithm and the tutorial aids did
affect performance. However, generalization was observed only
for subjects previously aided by the tutorial. This was
manifested by the fact that some of these subjects were able to
solve a second base-rate problem without the aid of the
tutorial.

Lichtenstein & MacGregor (1985) concluded that "the
tutorial approach holds great promise" (p. 20), since their
results had shown that subjects can be taught how to solve
base-rate problems successfully, in a relatively short period
of time, without individual tutoring, practice, or feedback.
This tutorial, however, led to systematic errors in calculating
the target probability. In their opinion, this conceptual
problem might be rectified by re-writing and expanding the
tutorial.


The tutorial aid should be modified, not only in order to
overcome "built in" errors, but also to increase its effect on
the user. In addition, it is important to examine the
applicability of this aid to various populations. The
subjects who participated in the above study were American
college students. Different populations have unique and
specific characteristics that may affect the capability of
their members to learn and generalize the material presented to
them by such an aid. Therefore, it is important to examine this
aid, and any modification applied to it, on various
populations. One important population is the military one,
whose members are potential users of cognitive aids. This
population is characterized by a unique way of thinking, and
furthermore, its tasks are usually performed under conditions
of stress.

One purpose of the present study is to test the
applicability of the tutorial aid employed by Lichtenstein &
MacGregor (1985) to Israeli university students. The second
purpose is to modify this tutorial in order to obtain the above
goals. An additional purpose is to test the effectiveness of
the modified tutorial on an Israeli military population,
especially under time stress conditions.

The basic concept underlying this modification is Paivio's
dual coding hypothesis. According to this hypothesis, learning
involves both mental images and verbal processes, operating
simultaneously (Paivio, 1971). The tutorial used by
Lichtenstein & MacGregor (1985) verbally explained the
base-rate problem and the way of solution. This natural
language mediation is a verbal strategy for the learning
process. Imagery can be used as a non-verbal strategy for the
learning process. That is, images can be used as a way of
organizing verbal items. This method is an effective mode of
learning (Adams, 1976). In the modified tutorial, the material
is presented both verbally and by images. This is called
Training by Mental Image (TbMI). The actual images used are
based on the concept of representation by Venn diagrams. The
application of images can contribute to a better understanding,
by turning the somewhat abstract situation described in
base-rate problems into a more concrete and clear one.
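One way to see how a Venn-style picture makes the situation concrete is to replace probabilities with counts of cases in each region. The sketch below is an assumption about how such a picture could be quantified for the Cab problem; it does not reproduce the images used in the TbMI tutorial, and the population of 1,000 cabs and all names are hypothetical.

```python
def frequency_partition(population: int, base_rate_pct: int, accuracy_pct: int) -> dict:
    """Split a population of cabs into the regions of a Venn-style picture."""
    blue = population * base_rate_pct // 100
    green = population - blue
    blue_called_blue = blue * accuracy_pct // 100             # correct "Blue" reports
    green_called_blue = green * (100 - accuracy_pct) // 100   # mistaken "Blue" reports
    return {
        "blue": blue,
        "green": green,
        "blue_called_blue": blue_called_blue,
        "green_called_blue": green_called_blue,
    }


# Imagine 1,000 cabs, 15% of them Blue, seen by a witness who is 80% accurate.
counts = frequency_partition(population=1000, base_rate_pct=15, accuracy_pct=80)
called_blue = counts["blue_called_blue"] + counts["green_called_blue"]
share_blue = counts["blue_called_blue"] / called_blue

print(counts)
print(f"{counts['blue_called_blue']} of {called_blue} 'Blue' reports are truly Blue "
      f"({share_blue:.2f})")
```

In this picture the region "called Blue" contains 120 Blue cabs and 170 Green cabs, so the overlap with "truly Blue" is 120 out of 290, which is the same answer Bayes' rule gives.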

An effective aid is one that improves intuitive
performance and gives new cognitive skills. Such an aid
should be effective regardless of variations of content and
structure. One content factor that can influence performance,
and which is another source of emotional stress, is risk. For
example, in solving base-rate problems, the two types of
information are interpreted and their relevance is determined.
If a problem's content indicates some risk, it may influence
these interpretations and the judgement of relevance. This may
cause one type of information to be judged as more salient, and
thus the availability heuristic may be used. That is, the
more salient type of information will be judged as more
relevant, and therefore will be the only basis for assessing
the required probability. However, if the aid is effective,
the salience of one item of information would not influence the
conceptualization of the problem or the assessed probability.
The effectiveness of the modified tutorial will also be
examined when the problem content indicates various levels of
risk. It is hypothesized that if this aid is effective, it
will remove the influence of the risk elements.


Experiment  III 

Experiment III is a partial replication of Lichtenstein &
MacGregor (1985). Of the four experimental conditions
employed in the original study, only the algorithm and the
tutorial conditions were used in the present one. These
conditions were selected because they were found to be
effective. The purpose of Experiment III was to test the
effectiveness of these aids in solving base-rate problems in an
Israeli student population.

Method 

Subjects. Sixty students participated in Experiment III.
The respondents were recruited from the Tel-Aviv University
Introductory Psychology subject pool.

Base-Rate  problems.  The  experiment  employed  the  Light  Bulb 
and  Dyslexia  problems  used  by  Lichtenstein  &  MacGregor  (1985). 
The  problems  are  presented  in  Appendix  D.  All  aspects  of  the 
problems  were  accurately  translated  into  Hebrew. 

Experimental design. Two independent variables were
manipulated in this experiment: aid type and problem type.

Type  of  aid.  In  the  first  condition,  subjects  were  aided 
by  an  algorithm  in  answering  the  first  base  rate  problem.  This 
condition  is  denoted  AL.  In  the  second  condition,  subjects 
were  aided  by  a  tutorial.  This  condition  is  denoted  TU.  Both 
the algorithm and the tutorial were adopted from Lichtenstein &
MacGregor (1985). The algorithm and the tutorial (Light Bulb
version) reported in the original study were translated into
Hebrew. A corresponding algorithm was composed for the
Dyslexia version. The algorithm and the tutorial are presented
in Appendix E.

Problem type. As in the original study (Lichtenstein &
MacGregor, 1985), in one condition subjects received the Light
Bulb problem as a training problem. In the other condition,
subjects received the Dyslexia problem as a training problem.
Both problems were of similar structure, but the diagnostic
information in the Light Bulb problem related only to the
probability of correct diagnosis, while in the Dyslexia
problem it related also to the probability of incorrect
diagnosis. For all the subjects the first problem, used as
training, was given with either the algorithm or the tutorial
as an aid. The second problem, used as a generalization
problem, was presented without an aid.

The experimental design with two independent variables,
which make up 4 different groups, is presented in Table 18.

Table 18: The Experimental Design of Experiment III

TRAINING PROBLEM      TYPE OF AID
                      ALGORITHM      TUTORIAL

LIGHT BULB
DYSLEXIA

The dependent variables. Three dependent variables were
measured in this experiment: response mode, and the degree of
confidence and reasonableness for the training problem.


Response mode. Subjects' probability assessments for each
problem were classified as follows:

"Correct"       If the subject's answer was equal to the
                normative solution according to Bayes' Theorem.

"Diagnostic"    If the subject's answer was equal to the
                diagnostic value given in the problem.

"Base-Rate"     If the subject's answer was equal to the
                Base-Rate value given in the problem.

"Conditional"   If the subject's answer was equal to the
                Base-Rate value multiplied by the diagnostic
                value.

"Other"         If the subject's answer was not equal to any of
                the above values.
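
The classification rule above can be sketched in code. This is a minimal illustration, not the scoring procedure actually used in the study: the function names, the tolerance `tol`, and the example values (a base rate of 0.10 and a diagnostic accuracy of 0.80 for both correct identifications and correct rejections) are assumptions chosen for illustration.

```python
import math


def bayes_posterior(base_rate, hit_rate, false_alarm_rate):
    """Normative P(hypothesis | positive diagnosis) by Bayes' Theorem."""
    true_pos = base_rate * hit_rate
    false_pos = (1.0 - base_rate) * false_alarm_rate
    return true_pos / (true_pos + false_pos)


def classify_response(answer, base_rate, hit_rate, false_alarm_rate, tol=0.005):
    """Assign a probability assessment to one of the five response modes.

    `tol` is a hypothetical rounding tolerance; the report does not say
    how close an answer had to be to count as "equal".
    """
    candidates = [
        ("Correct", bayes_posterior(base_rate, hit_rate, false_alarm_rate)),
        ("Diagnostic", hit_rate),
        ("Base-Rate", base_rate),
        ("Conditional", base_rate * hit_rate),
    ]
    for label, value in candidates:
        if math.isclose(answer, value, abs_tol=tol):
            return label
    return "Other"


# Illustrative values only: base rate 10%, device accurate in 80% of cases.
for answer in (0.31, 0.80, 0.10, 0.08, 0.50):
    print(answer, classify_response(answer, 0.10, 0.80, 0.20))
```

With these illustrative values the normative answer is about 0.31, so the five answers in the loop fall into the five categories in the order listed above.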


Degree of confidence. After completing the training
problem, the subjects had to rate the degree of confidence they
had in the accuracy of their responses, on a scale of 1 to 7,
ranging from "not at all confident" (1) to "very confident"
(7).

Reasonableness. The respondents answered "yes" or "no" to
the question "Does the answer you have reached seem reasonable
to you?". If they answered "no", subjects in the algorithm
group were also asked to provide a reasonable answer.

Procedure. The subjects were randomly assigned to four
groups of 15 students. They were run in groups of 3 to 5
people in a small classroom at the university. During the
sessions, pocket calculators were supplied in order to avoid
arithmetic errors, which might affect the assessed
probabilities, in solving both problems. The experimental
sessions  were  the  same  for  all  the  groups.  Subjects  first  read 
the  instructions,  and  then  solved  the  training  problem  (Light 
Bulb  or  Dyslexia)  with  the  aid  of  the  tutorial  or  the 
algorithm.  All  materials  used  in  the  performance  of  the 
training  problem  were  then  collected  and  subjects  were  asked  to 
work  on  the  generalization  problem  (Light  Bulb  or  Dyslexia) 
without  any  aid. 

Results 

Response  categories 

The distributions of subjects' response mode in both
problems, whether aided or unaided, are shown in Table 19.

Table 19: Frequencies and Proportions of Subjects' Response
Mode for Training and Generalization Problems


RESPONSE

TRAINING PROBLEM

GENERALIZATION PROBLEM

ALGORITHM

TUTORIAL

ALGORITHM

TUTORIAL

CORRECT 

10 

35.71% 

11 

38.23% 

16 

56.17% 

3 

10.34% 

* 
O  O 

1 

3.57% 

3 

10.34% 

10 

34.48% 

9 

32.14% 

9 

32.14% 

3 

10.34% 

CONDITIONAL 

0 

0% 

0 

0% 

i 

3.44% 

6 

20.68% 

OTHER

9

32.14%

The data in Table 19 show that the proportion of "correct"
responses for the training problem is almost the same for both
the AL (39%) and TU (36%) groups. On the other hand, when
considering the generalization problem, this proportion is much
higher for the TU group (56%) than for the AL group (10%). For
both aid types, the proportion of "correct" responses is higher
for the Light Bulb problem than for the Dyslexia problem, when
given as training problems. When given as generalization
problems, this proportion is equal for the group aided by the
algorithm. For the tutorial group this proportion is much
higher for the Light Bulb problem than for the Dyslexia one.

A chi-square test performed on the response frequencies for
the training problem failed to reach significance. A
significant effect was obtained, however, for the
generalization problem (chi-square=16.31, df=4, p<.01).

Confidence 

The  mean  confidence  ratings  are  shown  in  Table  20,  for  all 
subjects  in  each  aiding  condition  and  for  each  base-rate 
problem  (across  groups). 

Table 20: Means of Confidence Ratings for Training Problems


GROUP

PROBLEM

TUTORIAL

ALGORITHM

ACROSS GROUPS

LIGHT  BULB 

3.33 

5.27 

4.3 

DYSLEXIA

3.73 

4 

3.86 

ACROSS  PROBLEMS 

4.66 

3.35 

4.8 

Table 20 indicates that subjects reported higher confidence
in the accuracy of their answers to the Light Bulb problem than
in their answers to the Dyslexia problem. Subjects in the TU
groups reported higher confidence than those in the AL groups.

A t-test performed on the difference between the confidence
ratings of the two groups was significant (t=2.45, df=57,
p<.01). The difference between confidence ratings for the
Light Bulb and Dyslexia problems failed to reach significance.

Reasonableness

The proportions of "yes" and "no" answers to the question
"Does the answer you have reached seem reasonable to you?" are
shown in Table 21.


Table 21: Proportions of Reasonable and Unreasonable
Responses for Training Problems

PROBLEM         AID          NO             YES

LIGHT BULB      ALGORITHM     7 (46.66%)     8 (53.33%)
                TUTORIAL      2 (14.28%)    (illegible)
                TOTAL         9 (31%)       (illegible)

DYSLEXIA        ALGORITHM     5 (35.71%)    (illegible)
                TUTORIAL      1 (8.33%)     (illegible)
                TOTAL         6 (23%)       20 (78%)

BOTH PROBLEMS   ALGORITHM    12 (41.37%)    17 (58.63%)
                TUTORIAL      3 (11.53%)    23 (88.46%)
                TOTAL        15 (27%)       40 (73%)

The proportion of subjects reporting reasonable answers
(answering "yes" to the above question) was higher for the
Tutorial group (88%) than for the Algorithm group (59%); the
corresponding proportions of "no" answers were 12% and 41%.
A chi-square test on these data was significant
(chi-square=4.74, df=1, p<.05).
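
As a check on the reported statistic, the 2 x 2 chi-square can be recomputed from the "both problems" counts in Table 21 (12 "no" and 17 "yes" answers in one aid group, 3 "no" and 23 "yes" in the other). The sketch below is an illustration only: the report does not state which variant of the test was used, and the use of Yates' continuity correction is an assumption made here because it reproduces the reported value; the function name is hypothetical.

```python
def chi_square_2x2(a, b, c, d, yates=True):
    """Chi-square statistic for the 2 x 2 table [[a, b], [c, d]].

    With yates=True, Yates' continuity correction is applied
    (an assumption; the report does not name the variant used).
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(diff - n / 2.0, 0.0)
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * diff ** 2 / denom


# Rows: "no" / "yes"; columns: the two aid groups (counts from Table 21).
corrected = chi_square_2x2(12, 3, 17, 23, yates=True)
uncorrected = chi_square_2x2(12, 3, 17, 23, yates=False)
print(round(corrected, 2), round(uncorrected, 2))
```

The corrected statistic rounds to 4.74, matching the value in the text, while the uncorrected statistic is about 6.15.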


Discussion 

Experiment III examined the effectiveness of the algorithm
and tutorial aids in solving base-rate problems, in a sample of
Israeli students. The results showed that both the algorithmic
aid and the tutorial aid were effective as direct aids, and led
to a higher proportion of correct responses. However, the
tutorial aid led to an essential change in thinking and in the
way subjects conceptualized the problems, as manifested by the
generalization observed under this condition. The poor
generalization found for the algorithmic aid indicates that
this aid was technical and did not give the subjects new
cognitive skills with which to examine the base-rate problem.

This is also supported by the degree of confidence and
reasonableness reported under the two aiding conditions.
Subjects aided by the tutorial reported a higher degree of
confidence than those aided by the algorithm. The proportion
of subjects reporting reasonable answers was higher for the
subjects aided by the tutorial than for those aided by the
algorithm.

An  interaction  existed  between  the  aiding  method  and 
problem  type  which  determined  training  effectiveness.  The 
tutorial  aid  was  more  effective,  in  solving  the  Light  Bulb 
problem,  than  in  solving  the  Dyslexia  one.  Had  this  difference 
between  the  two  generalization  problems  been  observed  under 
both  aiding  conditions,  it  would  have  indicated  that  this  was  a 
result  of  the  different  content  and  structure  of  the  problems. 
Since  this  is  not  the  case,  it  may  indicate  that  the 
explanation  was  not  clear  enough  and  had  only  limited 
contribution  to  the  understanding  of  the  situation  described  in 
base-rate  problems. 


Experiment III examined the effectiveness of the algorithm
and tutorial for aiding Israeli students in solving base-rate
problems. The results showed that, although both aids were
effective, only the tutorial led to generalization of the
correct method of solution. Experiment IV was designed to
further develop this aid by introducing mental images. This
was tested on a military population, using base-rate problems
of military content, under normal and time stress conditions.

Method

Subjects. Two hundred and twenty-two IDF maintenance junior
officers participated in Experiment IV. All subjects had a
secondary education.

Base-Rate  Problems.  All  the  subjects  were  asked  to  solve  6 
different  Base-Rate  problems.  The  problems  are  presented  in 
Appendix  F.  All  problems  were  similarly  structured.  Each 
contained  base-rate  information  about  a  certain  phenomenon, 
information  about  the  degree  of  accuracy  of  a  given  diagnostic 
device, and a question concerning the accuracy of a particular
diagnosis.  Subjects  were  to  answer  in  terms  of  a  probability 
or  a  percentage  as  discussed  below. 

The  Color  Blindness  problem  was  used  for  illustration,  the 
Parachute  and  Jaundice  problems  were  used  as  training,  and  the 
Missile,  Seals  and  Masks  problems  were  generalization  problems. 

The three generalization problems had military content,
which was relevant to the subject pool, and were presented in
three different forms specifying different levels of risk:
neutral, general and personal (see Appendix F). The Missile
problem read as follows:

Intelligence sources revealed that a hostile army had
purchased sophisticated G-7 anti-aircraft missiles.

Israeli industry had developed a special device, capable
of receiving the signals broadcast from anti-aircraft
missiles, that enabled identification of missile type. The
device is known to be accurate in 80% of the cases, that is,
a G-7 missile type and missiles of "other types" will be
correctly identified as such in 80% of the cases. Knowledge
of the exact type of missile improves the defense profile of
an aircraft flying in a bound area.

Research done by the Tactical Warfare Development
Committee shows that the chance of a G-7 type missile being
launched is 10%.

"An anti-aircraft missile has been sent to a certain area
and was identified by the device as being of type G-7".

What are the chances that this missile is really a type G-7
missile?
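
The normative answer to this problem can be worked out by counting cases, in the spirit of focusing on sub-populations. The sketch below is an illustration rather than material from the study: the function name and the reference population of 1000 launched missiles are hypothetical, and only the 10% base rate and 80% accuracy come from the problem text.

```python
from fractions import Fraction


def posterior_from_counts(total, base_rate, accuracy):
    """P(missile is G-7 | device says G-7), by counting cases.

    `total` is a hypothetical reference population of launched missiles.
    """
    g7 = total * base_rate                    # missiles that really are G-7
    other = total - g7                        # missiles of other types
    g7_flagged = g7 * accuracy                # G-7 correctly identified as G-7
    other_flagged = other * (1 - accuracy)    # other types misidentified as G-7
    return g7_flagged / (g7_flagged + other_flagged)


# Of 1000 missiles: 100 are G-7 (80 flagged as G-7), 900 are other (180 flagged).
p = posterior_from_counts(1000, Fraction(10, 100), Fraction(80, 100))
print(p, float(p))
```

Of the 260 missiles identified as G-7 in this hypothetical population, only 80 really are G-7, so the normative answer is 4/13, or roughly 31%, well below the 80% "diagnostic" value.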

Experimental design. There were three independent
variables: provision of an aid, time restriction and level of
risk.

The TbMI Aid. In the aided condition (denoted A), the
subjects were presented with a tutorial, which was a modified
version of the one used by Lichtenstein & MacGregor (1985).

The modification involved changing only the mode of
presentation of the problem situation, not the method of
calculation (which was an expansion of the explanation of
base-rate problems given by Beyth-Marom, Dekel, Gombo, and
Shaked, 1985). The tutorial (shown in Appendix G) contained a
detailed analysis of the situation described in the Color
Blindness problem, represented pictorially. A "tree" structure
was used to focus the subject on the target sub-populations
(i.e., those who were diagnosed as color blind and, of those
diagnosed as such, those who were in fact color blind). The
tutorial was accompanied by a verbal explanation and slides.
This was followed by the two training problems (Parachute and
Jaundice). After completing the first training problem, these
subjects were given feedback by being shown the correct
solution. In the unaided condition (denoted UA), the subjects
read a short essay discussing general statistical subjects.


Risk. In the neutral condition (denoted NR), the subjects
had to solve generalization problems describing situations of
neutral risk. By "Neutral Risk" is meant that the problem did
not indicate any danger to the reader himself or to relevant
others. For example:

"An anti-aircraft missile has been sent to a certain area
and was identified by the device as being of type G-7".

In the general risk condition (denoted GR), the subjects
had to solve generalization problems specifying general risk.
By "General Risk" is meant that the problem gave a hint of
potential danger to someone other than the reader him/herself.
For example:

"An anti-aircraft missile has been launched to a certain
area where only Israeli aircraft fly, and was identified
by the device as being of type G-7".

In the personal risk condition (denoted PR), the subjects
were given generalization problems specifying personal risk.
By "Personal Risk" is meant that the problem described
situations endangering the reader himself. For example:

"Suppose you are a pilot flying an Israeli aircraft. An
anti-aircraft missile that had been launched to the area
where you are flying was identified by the device as
being of type G-7".

Time restriction. There were two time restriction
conditions. In the unlimited time condition (denoted NTL), the
subjects were to solve each generalization problem without any
time restriction. In the limited time condition (denoted TL),
the subjects had to solve each generalization problem within 4
minutes. The 4 minute limit was determined by running the
NTL/A groups first, and measuring the time required to solve
each one of the generalization problems. The time limit for
the "time restricted" groups was chosen by taking the lowest
time-to-solution with the highest frequency, provided that two
or more subjects achieved that time-to-solution.
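
One possible reading of this selection rule can be sketched as follows. The interpretation is an assumption: here the limit is taken to be the most frequent time-to-solution, with ties broken in favor of the lowest time, and no limit is returned unless at least two subjects share that time. The function name and the example times (in whole minutes) are hypothetical and are not data from the study.

```python
from collections import Counter


def choose_time_limit(solution_times):
    """Most frequent time-to-solution; ties go to the lowest time.

    Returns None when no time was achieved by two or more subjects.
    This is one interpretation of the rule described in the text.
    """
    if not solution_times:
        return None
    counts = Counter(solution_times)
    top = max(counts.values())
    if top < 2:
        return None
    return min(t for t, c in counts.items() if c == top)


# Hypothetical times-to-solution in minutes, not data from the study.
print(choose_time_limit([3, 4, 4, 5, 5, 6]))
```

In this made-up example, 4 and 5 minutes are equally frequent, so the lower value, 4, is chosen.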

The experimental design with three independent variables,
which make up 12 groups, is presented in Table 22.

Table 22: The Experimental Design of Experiment IV

TIME:  UNLIMITED, LIMITED
RISK:  NEUTRAL, GENERAL, PERSONAL
AID:   AIDED, UNAIDED

The dependent variables. Four dependent variables were
measured in this experiment: response mode, the degree of
confidence in the accuracy of the answer, reasonableness and
subjective mental load.

Response  mode.  The  answers  to  each  generalization  problem 
were  classified  in  the  same  manner  as  done  in  Experiment  III. 

Confidence. The degree of confidence in the accuracy of
each  response  was  rated  in  the  same  manner  as  in  Experiment 
III. 


Reasonableness.  As  in  Experiment  III,  the  respondents 
answered  "yes"  or  "no",  to  the  question  "Does  the  answer  you 
have  reached  seem  reasonable  to  you?". 


Subjective  mental  load.  After  completing  all  three 
generalization  problems,  subjects  filled  the  subjective  mental 
load  questionnaire,  used  in  Experiments  I  and  II. 

Procedure.  The  subjects  were  randomly  assigned  to  12 
groups  of  15  to  20  officers,  each  was  run  separately  in  small 
class  rooms  at  an  army  base.  The  instructions  preceding  each 
session  indicated  that  the  purpose  of  the  study  was  to  examine 
the  ways  in  which  people  solve  various  problems.  It  was 
emphasized  that  participation  in  the  experiment  was  anonymous, 
and  that  subjects'  performance  would  not  affect  their  career. 
During  the  sessions,  pocket  calculators  were  supplied  for  use, 
in  order  to  avoid  arithmetic  errors  in  solving  the  problems. 
The  experimental  sessions  were  the  same  for  all  the  groups. 
Subjects  were  given  a  brief  introduction  to  the  study  and  then 
filled  in  a  standard  form  containing  details  such  as  age,  sex, 
months  of  service,  current  job,  command  experience,  education 
and  the  like. 


All subjects were first administered the Color Blindness
problem. After solving this problem the subjects read the
tutorial or the general essay, accompanied by the verbal
presentation and the slides. This was followed by the training
problems. During training, the subjects who read the tutorial
were provided with feedback, and were allowed to ask questions.
All materials used in the performance of the above problems
were then collected, and subjects were asked to work on the
generalization problems, again unaided. At this point, the
risk and time restriction conditions were manipulated without
prior notice, i.e., subjects were not told at the beginning of
the experimental session that they would be time-restricted
later. The respondents' final task was to fill in the
subjective mental load questionnaire.


Validity of Time Stress Manipulation

The  time  stress  manipulation  was  validated  based  on 
subjects'  subjective  ratings  of  time  stress  (Appendix  B,  item 
5).  The  mean  ratings  and  s.  d.  for  all  the  groups  are 
presented  in  Table  23. 

Table 23: Means and Standard Deviations of Time Stress
Ratings (Upper No. = Mean, Lower No. = s.d.)

TIME         AIDED     UNAIDED     TOTAL

UNLIMITED    1.81      1.45        1.61
             1.53      1.14        1.34

LIMITED      2.78      1.58        2.17
             1.62      1.10        1.50

TOTAL        2.31      1.51        1.89
             1.64      1.12        1.44

Table  23  shows  that  the  mean  rating  for  all  the  TL  groups 
across  the  "aid"  and  "risk"  conditions  is  higher  than  the  mean 
rating  for  the  NTL  groups. 


An analysis of variance performed on these data showed a
significant main effect for time limit (F(1,192)=7.55, p<.01).
The aid x time limit interaction was also found to be
significant (F(1,192)=5.06, p<.01). This interaction is shown
in Figure 1.

Figure 1: Mean Time Stress Ratings, as a Function of Time
Restriction and Aiding Conditions

A and UA groups. The distribution of subjects' response
mode for each generalization problem is shown in Table 24.

Table 24 shows that in the UA groups, the most frequent
response category is the "diagnostic" one (56% for Missile, 53%
for Seals and 49% for Masks), while the proportion of "correct"
responses is always 0. In the A groups the most frequent
category is the "correct" one (48% for Missile, 62% for Seals,
and 65% for Masks), while the proportion of "diagnostic"
responses is very low. Note that the proportion of
"conditional" responses in the A groups (21% for Missile, 20%
for Seals and 23% for Masks) decreased relative to the UA
groups.

Table 24: Frequencies and Proportions of Subjects' Response
Mode for Each Generalization Problem

                 MISSILE                      SEALS                        MASKS
RESPONSE      AIDED        UNAIDED        AIDED        UNAIDED        AIDED        UNAIDED

CORRECT       53 (48.18%)   0 (0%)        67 (62.03%)   0 (0%)        66 (65.34%)   0 (0%)
DIAGNOSTIC     3 (2.72%)   61 (69.31%)     5 (4.62%)   59 (53.15%)    39           54 (49.09%)
BASE RATE      0 (0%)       5 (5.68%)      2 (1.85%)    3 (2.7%)       4 (3.96%)    5 (4.54%)
CONDITIONAL    4 (3.36%)   23 (26.13%)     3 (2.77%)   22 (19.81%)     1 (0.99%)   25 (22.72%)
OTHER         50 (48.45%)  20 (22.72%)    31 (28.7%)   27 (24.32%)    23 (22.77%)  26 (23.63%)

A chi-square test on these data indicated a significant
difference between the response distributions of the A groups
and the UA groups for Missile (chi-square=136.78, df=4,
p<.01), Seals (chi-square=127.46, df=4, p<.01), and Masks
(chi-square=124.50, df=4, p<.01).

The proportion of correct responses across all three
generalization problems was computed for each subject. This
proportion is, on average, .57 for all the A groups and 0
for all the UA groups. Analysis of variance on these data
showed a significant main effect for aid type
(F(1,162)=248.858, p<.01).

NTL/A  and  NTL/UA  groups.  The  distribution  of  subjects' 
responses,  with  regard  to  the  NTL  groups,  to  each 
generalization  problem  is  shown  in  Table  25. 

The data in Table 25 show that the proportion of "correct"
responses is higher for the A/NTL groups than for the UA/NTL
groups, and the proportion of "conditional" responses in the
A/NTL groups decreases relative to the UA/NTL groups.

A chi-square test performed on these data indicated a
significant difference between the response distributions of
the A/NTL groups and the UA/NTL groups for Missile
(chi-square=76.85, df=4, p<.01), Seals (chi-square=69.32,
df=4, p<.01) and Masks (chi-square=67.64, df=4, p<.01).

The distribution of subjects' responses, when considering
the TL groups, to each generalization problem is shown in Table
26.

Again, the data in Table 26 indicate that the proportion of
"correct" responses is higher for the A/TL groups than for the
UA/TL groups, and the proportion of "conditional" responses in
the A/TL groups decreases relative to the UA/TL groups.

A chi-square test on these data indicated a significant
difference between the response distributions of the A groups
and the UA groups for Missile (chi-square=61.13, df=4, p<.01),
Seals (chi-square=63.03, df=4, p<.01) and Masks
(chi-square=57.88, df=4, p<.01).

Risk.  The  distribution  of  subjects'  response  mode,  in  the 
various  risk  levels  groups,  to  each  generalization  problem  is 
shown  in  Tables  27  to  29. 


Table 27: Frequencies and Proportions of Subjects' Response
Mode for Each Generalization Problem for Neutral
Risk Groups


Table 28: Frequencies and Proportions of Subjects' Response
Mode for Each Generalization Problem for General
Risk Groups


Table 29: Frequencies and Proportions of Subjects' Response
Mode for Each Generalization Problem for Personal
Risk Groups


                 MISSILE                      SEALS                        MASKS
RESPONSE      AIDED        UNAIDED        AIDED        UNAIDED        AIDED        UNAIDED

CORRECT       16 (41.02%)   0 (0%)        23 (60.52%)  (illegible)    25 (67.56%)   0 (0%)
DIAGNOSTIC     1 (2.56%)   20 (55.55%)    (illegible)  21 (55.26%)     2 (5.4%)    17 (44.73%)
BASE RATE     (cells illegible in the source)
CONDITIONAL    1 (2.56%)    9 (25%)        2 (5.26%)    8 (21.05%)     0 (0%)      10 (26.31%)
OTHER         21 (53.84%)   4 (11.11%)     9 (23.68%)   8 (21.05%)     7 (18.42%)   9 (23.68%)

As for Time Limit and Aid, the data in Tables 27 to 29
indicate that the proportion of "correct" responses is higher
for the A groups than for the UA groups, and the proportion of
"conditional" responses in the A groups decreased relative to
the UA groups, for all risk levels.

A chi-square test on the data for each risk level indicated
significant differences between the response distributions of
the A groups and the UA groups for each generalization
problem. The test results are summarized in Table 30.

Table 30: Chi-Square Test Results of Subjects' Response
Mode, in All Generalization Problems, for Each
Risk Level (Upper No. = Chi-Square, Middle No. = df,
Lower No. = Significance)

PROBLEM      NEUTRAL     GENERAL     PERSONAL

MISSILE      43.57       42.15       54.12
             3           4           4
             0.00        0.00        0.00

SEALS        43.39       44.66       42.68
             4           4           4
             0.00        0.00        0.00

MASKS        48.20       32.95       47.29
             4           4           4
             0.00        0.00        0.00

The  Time  Limit  Manipulation 

This manipulation seemed to have only a minor effect on
response distributions. The response distributions for the TL
groups and the NTL groups are shown in Table 31. A chi-square
test comparing the response distributions of the TL and NTL
groups failed to reach significance, except for the Missile
problem (chi-square=10.10, df=4, p<.05). This indicates a
greater proportion of "correct" and "diagnostic" responses in
the NTL groups, and a higher proportion of "other" responses in
the TL groups. Significance was reached for the following
specific comparisons:


Table 31: Frequencies and Proportions of Subjects' Response
Mode for Each Generalization Problem for TL and
NTL Groups


PROBLEM

MISSILE

SEALS

MASKS

RESPONSE

LIMITED

UNLIMITED

LIMITED

UNLIMITED

LIMITED

UNLIMITED

CORRECT 

H 

20 

18.7% 

37 

33.3% 

30 

27.8% 

35 

32.4% 

31 

30.1% 

DIAGNOSTIC 

■ 

29 

27.1% 

31 

27.8% 

33 

30.6% 

27 

25.0% 

34 

33.0% 

BASE  RATE 

4 

3.6% 

1 

0.8% 

3 

2.7% 

2 

1.8% 

3 

2.8% 

6 

5.8% 

CONDITIONAL 

m 

13 

12.1% 

16 

14.4% 

9 

8.3% 

19 

17.6% 

7 

6.8% 

OTHER 

m 

44 

41.1% 

24 

21.6% 

34 

31.5% 

m 

m 

For the Missile problem only: the proportion of "correct"
responses in the A groups is higher for the NTL groups than
for the TL groups (chi-square=10.04, df=3, p<.05).

For the Seals problem only: in the A/NR groups, the
proportion of "correct" responses is higher for the NTL groups
(chi-square=11.57, df=4, p<.05).

For the Masks problem only: in the UA groups, the
proportion of "diagnostic" responses is lower for the NTL
groups, while the proportion of "conditional" responses is much
higher (chi-square=8.27, df=3, p<.05). In addition, in the
UA/GR groups, the proportion of "diagnostic" responses is lower
for the NTL groups, while the proportion of "conditional"
responses is higher (chi-square=8.76, df=3, p<.05).

The  Risk  Manipulation 

The risk manipulation was found to have no effect on
response distribution under any of the conditions.

Confidence

The mean confidence ratings and s.d. for each
generalization problem are shown in Table 32.

Table 32: Means and s.d. of Confidence Ratings for all
Generalization Problems (Upper No. = Mean, Lower
No. = s.d.)


PROBLEM 


RESPONSE^V^ 

AIDED 

AIDED 

BBSi 

LIMITED 

5.42

5.02 

5.67 

5.02 

5.76 

5.10 

1.65 

1.57 

.89 

1.56 

1.00 

1.34 

UNLIMITED

5.17 

5.58 

6.17 

5.69 

6.29 

5.86 

1.51 

1.67 

1.33 

1.39 

1.64 

1.49 

Table  32  shows  that,  for  all  three  general ization  problems, 
the  mean  ratings  are  higher  for  the  NTL  groups  than  the  TL 
groups.  The  mean  ratings  are  higher  for  the  A  groups  than  for 
the  UA  groups,  for  the  missile  and  masks  problems.  For  the 
seals  problem,  the  mean  ratings  are  higher  for  the  UA  group 
then  the  A  group  under  time  stress  conditions.  With  regard  to 
the  risk  groups,  the  ratings  are  highest  for  the  NR  groups, 
followed  by  the  GR  and  then  the  PR  groups. 

Analyses of variance on these data, for each generalization
problem, indicated a significant main effect of time limit for
the Seals (F(1,176)=8.71, p<.01) and Masks (F(1,176)=10.51,
p<.01) problems. A significant main effect of aid condition
was also found for the Seals (F(1,176)=8.45, p<.01) and Masks
(F(1,176)=7.58, p<.01) problems. The two-way interaction
effect of time limit x risk was significant for the Seals
problem (F(1,176)=5.23, p<.01). This interaction is shown in
Figure 2.

An overall measure of confidence was obtained by averaging
the ratings across all three generalization problems. The mean
confidence ratings and s.d. for the various groups are shown
in Table 33.


Table 33: Means and s.d. of Confidence Ratings for all
Experimental Groups (Upper No. = Means; Lower No. = s.d.)

[Table body not legible. Column headings: PROBLEM (NEUTRAL, GENERAL, PERSONAL, TOTAL) by RESPONSE (AIDED, UNAIDED). Row headings: LIMITED, UNLIMITED, TOTAL.]

APPENDIX A

Estimation  problems 


1. How much food (in Kg.) does an average person consume
during his entire lifetime?


2. What is the number of beds in all the general hospitals in
Israel?


3. How many liters of water, for home usage, are consumed in
Israel during one year?


4. How many cars are owned by the Israeli population?


5. How many students graduate secondary school in a year?


6. How many airplanes land and take off (in season) in one day
at the Ben-Gurion Airport?


7. What is the number of members of the academic staff
employed in Israeli universities?


8. How many active bus drivers are employed in "EGED"?


APPENDIX B

Subjective Mental Load Questionnaire

1. What was the degree of difficulty you encountered in
answering the questions?

Very Easy |---|---|---|---|---|---| Very Difficult


2. How much thinking effort was required to answer the
questions?

Little Effort |---|---|---|---|---|---| A Lot of Effort


3. Was answering the questions tiring?

Not Tiring at all |---|---|---|---|---|---| Very Tiring


4. Was answering the questions frustrating?

Not Frustrating at all |---|---|---|---|---|---| Very Frustrating


5. Did you have enough time for answering the questions?

Enough Time |---|---| Not Enough Time

APPENDIX C

Training  for  Using  Algorithmic  Decomposition 

In many situations in everyday life we have to make a vast
number of decisions. Each decision is usually based on a set
of data concerning different aspects of the specific situation
at hand. For example, a commander has to decide how to deploy
his forces in a certain area. To make such a decision, he
needs a large amount of data. For instance, he has to know the
number of soldiers in his force; how many arms are at his
disposal; the amount of available ammunition; what are the
terrain conditions; what is the enemy's troop deployment; etc.

That  is  to  say  that  any  and  every  decision  must  be  made 
after  taking  into  account  the  answers  to  a  series  of  questions, 
some  of  them  quantitative.  Some  of  these  answers  are  readily 
available  and  easily  obtained,  e.g.  the  number  of  arms  and 
soldiers  in  the  commander’s  force.  Other  relevant  information 
can  be  obtained  through  various  military  services.  But  in  many 
cases  the  necessary  data  is  unavailable.  In  these  cases  we 
must  estimate  the  values. 

If reliable decisions are to be made, then the estimates
must  be  made  as  accurately  as  possible.  In  addition,  in  many 
decision  making  situations  the  time  factor  is  very  crucial,  and 
thus  the  estimates  must  also  be  made  as  quickly  as  possible. 

For example, if the commander errs in underestimating the
enemy’s  arms,  he  may  decide  on  a  force  deployment  that 
endangers  his  soldiers.  Similarly,  if  he  spends  too  much  time 
in  gathering  the  relevant  information  his  decision,  although 
correct,  may  be  made  too  late. 

Therefore,  it  is  important  that  he  use  a  method  which  will 
help  him  reach  the  most  precise  estimates  possible,  in  the 
shortest  possible  time.  One  of  the  possible  methods  is  the  use 
of  partial  knowledge  to  estimate  related  quantities,  and 
applying  this  to  generate  the  target  estimate. 

Research has shown that people tend to make a lot of mistakes
when  required  to  estimate  unknown  quantities.  To  be  more 
accurate  and  to  avoid  errors  one  has  to  use  an  efficient  method 
of  estimation. 

In this experiment you will be presented with a method
based on utilizing partial knowledge or sub-estimates of
related quantities. The method involves the following three
elements:

1. The sub-division of the target question into a number of
sub-questions.

2. Assigning values to the sub-questions.

3. Combining these values, by rule, to arrive at the target
answer.

The method is illustrated using the following example.
Suppose the question is: What is the number of beds in a
certain general hospital?

Since  it  is  unlikely  for  this  answer  to  be  known,  the 
correct  answer  must  be  estimated.  To  do  this  the  target 
question  must  first  be  divided  into  sub-questions  that  are 
related to the target answer. For example:


1.  How  many  departments  are  there  in  a  general  hospital? 

2.  How  many  rooms  are  there  in  each  department? 

3.  How  many  beds  are  there  in  each  room? 

In  the  next  stage  we  can  try  to  answer  these  questions. 

The appropriate values may be available or more easily
estimated than the target value. For instance, based on the
knowledge we have, we can count the names of various departments
and reach an accurate estimate. In the same manner we can
estimate  the  number  of  rooms  in  each  department  and  the  number 
of  beds  in  each  room. 

The next stage is to define a rule for combining the
various values we have reached in order to arrive at the target
estimate. In our example, the rule is:

1.  Multiply  the  number  of  beds  in  each  room  by  the  number 
of  rooms  in  each  department.  The  result  is  the  number 
of  beds  in  each  department. 

2.  Multiply  the  number  of  beds  in  each  department  by  the 
number  of  departments  in  the  hospital. 

In this way we obtain a value which is an accurate estimate
of the target value. This value may not be identical to the
target value, but it is probably a good approximation of the
real value and more accurate than any guess.
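The combination rule for the hospital example can be sketched in a few lines of Python. This is only an illustration: the function name and the three sub-estimates passed to it are placeholders chosen to show the arithmetic, not values from the study.

```python
def estimate_hospital_beds(departments, rooms_per_department, beds_per_room):
    """Combine three sub-estimates into the target estimate."""
    # Rule 1: beds per room x rooms per department = beds per department.
    beds_per_department = beds_per_room * rooms_per_department
    # Rule 2: beds per department x number of departments = beds in the hospital.
    return beds_per_department * departments


# Placeholder sub-estimates (assumptions, for illustration only).
print(estimate_hospital_beds(departments=20, rooms_per_department=10, beds_per_room=4))
```

Changing any one sub-estimate changes the target estimate in proportion, which is why each sub-question should be one that can be answered with some confidence.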

The  set  of  sub-questions  and  the  rules  for  combining  the 
estimated values are called an Algorithm.

There  are  many  ways  to  compose  an  algorithm.  For  example 
one  can  retrieve  from  one's  memory  partial  knowledge  which  is 
relevant to the target value, and define a rule for combining
these  pieces  of  knowledge.  It  is  also  possible  to  develop  a 
set  of  sub-questions  and  a  rule,  and  estimate  the  values 
required  by  the  sub-questions. 

Algorithms  can  involve  sub-estimates  of  values  that  are 
larger  or  smaller,  in  magnitude,  than  the  target  value. 
Algorithms can involve integers or fractions (proportions),
e.g.,  the  proportion  vs.  the  number  of  smokers  in  Israel. 

In  defining  an  algorithm  one  has  to  use  appropriate 
measurement  units,  for  example  the  distance  between  point  A  and 
point B is better estimated in Km. than in Cm.

It  is  advised  to  avoid  composing  very  long  algorithms, 
since  a  long  one  may  increase  the  error  in  the  target  value, 
and  lengthen  the  time  required  to  reach  it. 

The  above  method  can  be  applied  when  estimating  the 
following  quantity: 

How  many  cigarettes  are  manufactured  in  Israel  in  a  year? 

We  will  define  an  algorithm  that  will  aid  us  in  making  as 
accurate  an  estimate  as  possible.  Work  according  to  the 
following stages:

a. Locate relevant knowledge domain.

At  this  stage  we  have  to  locate  and  count  relevant  domains 
of  knowledge,  on  which  we  can  base  rules  of  calculations  in 
order  to  answer  the  target  questions.  We  therefore  have  to  find 
topics  for  which  we  have  available  knowledge.  For  example,  we 
can  define  rules  of  calculations  according  to  one  of  the 
following  domains  of  knowledge: 

1.  The  consumption  of  cigarettes  in  Israel. 

2. The production capacity of cigarette manufacturers in
Israel.

3. The amount of cigarettes sold in Israel.


b. Determine the knowledge domain on which the algorithm will
be based.

At  this  stage  we  have  to  examine  each  domain  of  knowledge, 
and  decide  which  one  offers  the  most  available  information. 

For  example,  we  will  examine  the  amount  of  information  required 
for  estimating  how  many  cigarettes  are  manufactured  in  Israel 
in a year, based on the consumption of cigarettes in Israel.
This  information  can  be  as  follows: 

1.  The  number  of  smokers  in  Israel. 

2.  The  number  of  cigarettes  consumed  by  each  smoker  in  a 
certain  period  of  time. 


Now  we  will  examine  the  amount  of  information  required  for 
this estimation, based on the production capacity of
cigarette  manufacturing  factories  in  Israel.  This  information 
can  be  as  follows: 

1. The number of cigarette manufacturers in Israel.

2.  The  production  of  each  factory  in  a  certain  period  of 
time. 

Finally,  we  will  examine  the  amount  of  information  required 
for  this  estimation,  based  on  the  amount  of  cigarettes  sold  in 
Israel. This information can be as follows:

1.  The  number  of  stores  selling  cigarettes. 

2.  The  amount  of  cigarettes  sold  in  each  store  during  a 
certain  period  of  time. 

Obviously  the  information  available  to  each  one  of  us  is 
different,  and  it  is  possible  that  one  may  prefer  to  compose 
algorithms  based  on  a  certain  knowledge  domain,  while  the  other 
may  prefer  another  domain.  It  is  likely  that,  for  most  of  us, 
the  most  available  and  accurate  information  of  all  three 
knowledge  domains,  discussed  above,  is  the  one  related  to  the 
consumption of cigarettes in Israel. It would seem easier to
estimate the number of cigarettes consumed by each smoker,
and thus the consumption of all smokers, than to count all the
stores and the amount of cigarettes sold in each one.

Therefore we will choose knowledge domain (1): The
consumption of cigarettes in Israel.

c.  Locate  a  basic  information  unit. 

Once we choose the knowledge domain on which the algorithm
will be based, we can start composing it. First we must locate
a basic information unit that will serve as the starting point.
This unit must meet the following criteria:

1.  It  must  be  relevant  to  the  target  question. 

2.  It  can  be  assigned  an  estimable  value. 

3.  Its  value  can  be  changed  by  adding  new  information. 


A  basic  unit  can  be,  for  example,  the  average  amount  of 
cigarettes  consumed  by  one  smoker  per  day.  This  will  determine 
one  of  the  sub-questions  in  the  algorithm. 

d. Compose the algorithm.

Based on the sub-question defined above, we will now define
other sub-questions, each referring to any information that may
bring  the  initial  value  closer  to  the  target  estimate.  Also, 
we  will  define  the  rules  of  calculations  according  to  which  the 
sub-estimates  are  combined. 

The  resulting  algorithm  may  be  as  follows: 

How  many  cigarettes  are  manufactured  in  Israel  in  a  year? 

a.  What  is  the  population  of  Israel? 

b.  What  proportion  of  the  population  smokes? 

c.  What  is  the  number  of  smokers  in  Israel? 

[Multiply (a) by (b)].

d. How many cigarettes does the average smoker consume per
day?

e. How many cigarettes are consumed in Israel per day?
[Multiply (c) by (d)].

f. How many days are there in a year?

g. How many cigarettes are consumed in Israel in a year?
[Multiply (e) by (f)].

e. Do the calculation.

Now we will try to make the sub-estimates and generate the
target estimate. Please work according to the above algorithm.
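As a sketch of how the cigarette algorithm combines its sub-estimates, the chain of multiplications can be written out in Python. The function name and every number passed to it are hypothetical placeholders, not data from the study or real statistics.

```python
def cigarettes_per_year(population, smoker_proportion,
                        cigarettes_per_smoker_per_day, days_per_year=365):
    """Chain the sub-estimates of the cigarette algorithm."""
    smokers = population * smoker_proportion                  # (c) = (a) x (b)
    per_day = smokers * cigarettes_per_smoker_per_day         # (e) = (c) x (d)
    return per_day * days_per_year                            # (g) = (e) x (f)


# Placeholder sub-estimates (assumptions, for illustration only).
estimate = cigarettes_per_year(population=4_000_000,
                               smoker_proportion=0.25,
                               cigarettes_per_smoker_per_day=20)
print(f"{estimate:,.0f} cigarettes per year")
```

Note that the algorithm estimates consumption and uses it as a stand-in for production, so the result is only as good as that substitution.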

After  you  have  completed  the  cigarettes  question,  consider 
the  following  question: 

How  many  Kg.  of  fish  are  caught  in  Israel  in  one  year? 

We will present you with two different algorithms for making this
estimation. Read through the algorithms carefully and decide
which one is more effective and yields a more accurate estimate.

Algorithm  1. 

a.  How  many  fishermen  are  there  in  Israel? 

b.  How  many  Kg.  of  fish  are  caught  by  one  fisherman  in  one 
day? 

c.  How  many  Kg.  of  fish  are  caught  by  all  the  fishermen  in 
Israel  in  one  day? 

[Multiply (a) by (b)]

d.  How  many  working  days  are  there  in  one  year? 

e.  How  many  Kg.  of  fish  are  caught  in  Israel  in  one  year? 
[Multiply (c) by (d)].

Algorithm  2. 

a.  What  is  the  population  of  Israel? 

b.  How  many  Kg.  of  fish  are  consumed  by  one  person  in  one 
year? 

c.  How  many  Kg.  of  fish  are  consumed  by  the  entire 
population  in  one  year? 

[Multiply (a) by (b)]

d. What is the proportion of imported fish of all fish
consumed by the entire population in one year?

e. How many Kg. of fish are imported each year?
[Multiply (c) by (d)]

f. How many Kg. of fish are caught in Israel in one year?
[Subtract (e) from (c)].

Which  of  the  previous  two  algorithms  seems  more  effective, 
and  why?  _ 


When you have finished please report to the experimenter.

Explanation: From among the two algorithms presented above,
the  more  effective  would  seem  to  be  Algorithm  2.  The  answers 
to  the  sub-questions  of  Algorithm  2,  are  more  available  than 
those  to  Algorithm  1. 
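To make the comparison concrete, both fish algorithms can be written as small Python functions. This is a sketch only: the function names and all input values are placeholders, and the two results are not expected to agree, since each depends on how well its own sub-questions can be answered.

```python
def fish_by_fishermen(fishermen, kg_per_fisherman_per_day, working_days):
    """Algorithm 1: build the yearly catch up from the daily catch."""
    kg_per_day = fishermen * kg_per_fisherman_per_day   # (c) = (a) x (b)
    return kg_per_day * working_days                     # (e) = (c) x (d)


def fish_by_consumption(population, kg_per_person_per_year, imported_proportion):
    """Algorithm 2: start from consumption and remove imports."""
    consumed = population * kg_per_person_per_year      # (c) = (a) x (b)
    imported = consumed * imported_proportion            # (e) = (c) x (d)
    return consumed - imported                           # (f) = (c) - (e)


# Placeholder sub-estimates (assumptions, for illustration only).
print(fish_by_fishermen(1_000, 50, 250))
print(fish_by_consumption(4_000_000, 10, 0.5))
```

Algorithm 2 needs a population figure and a per-person consumption figure, which most people can estimate from everyday experience; Algorithm 1 needs the number of fishermen and their daily catch, which few people know.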

In  the  course  of  this  experiment  you  will  be  presented  with 
a  number  of  estimation  problems  similar  to  those  you  have 
solved.  You  are  required  to  solve  these  problems  according  to 
the  method  described  above. 

Before  proceeding  please  reread  the  method,  and  if  you  have 
any questions ask the experimenter.

APPENDIX D

Base-Rate Problems for Experiment III


The Light Bulb Problem


A light bulb factory uses a scanning device which is supposed
to put a mark on each defective bulb it spots in the assembly
line. Eighty-five percent of the light bulbs on the line are
OK; the remaining 15% are defective.

The scanning device is known to be accurate in 80% of the
decisions, regardless of whether the bulb is actually OK or
actually defective. That is, when a bulb is good, the scanner
correctly identifies it as good 80% of the time. When a bulb
is defective, the scanner correctly marks it as defective 80%
of the time.

Suppose someone selects one of the light bulbs from the line
at random and gives it to the scanner. The scanner marks this
bulb as defective.

What is the probability that this bulb is really defective?


The  Dyslexia  Problem 

Dyslexia is a disorder characterized by an impaired ability
to read. Two percent of all first graders have dyslexia. A
screening test for dyslexia has recently been devised that
can be used with first graders. The screening test is cheap
and easy to administer; it identifies those children who will
later be given a more extensive test to determine for sure
whether the child has dyslexia. The screening test is not
completely accurate. For children who really have dyslexia,
the screening test is positive (indicating dyslexia) 95% of
the time. But it also gives a positive (dyslexia) result for
5% of the normal children, the ones who do not have dyslexia.

A first grader is given the screening test and the result is
positive, indicating dyslexia.

What is the probability that the child really has Dyslexia?
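For reference, the normative answer to both problems can be computed with a short Python sketch. The function and parameter names below are illustrative assumptions and do not appear in the experimental materials; the 20% false-alarm rate for the scanner is derived from its stated 80% accuracy on good bulbs.

```python
def posterior(base_rate, hit_rate, false_alarm_rate):
    """Probability the condition is present, given a positive signal."""
    true_positives = base_rate * hit_rate
    false_positives = (1 - base_rate) * false_alarm_rate
    return true_positives / (true_positives + false_positives)


# Light bulb problem: 15% defective, scanner marks 80% of defective bulbs
# and (wrongly) 20% of good bulbs.
light_bulb = posterior(base_rate=0.15, hit_rate=0.80, false_alarm_rate=0.20)

# Dyslexia problem: 2% base rate, 95% positive for dyslexic children,
# 5% positive for children without dyslexia.
dyslexia = posterior(base_rate=0.02, hit_rate=0.95, false_alarm_rate=0.05)

print(round(light_bulb, 2), round(dyslexia, 2))
```

In both problems the result is well below the accuracy of the scanner or test, because the condition itself is rare.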
APPENDIX E

Tutorial (Light bulb version)

Consider the following problem:

A  cab  was  involved  in  a  hit  and  run  accident  at  night.  Two 
cab  companies,  the  Green  and  the  Blue,  operate  in  the  city. 

You  are  given  the  following  data: 

a. 90% of the cabs in the city are Green and 10% are Blue.

b.  A  witness  identified  the  cab  as  Blue. 

The  court  tested  the  reliability  of  the  witness  under  the 
same  circumstances  that  existed  on  the  night  of  the  accident 
and  concluded  that  the  witness  correctly  identified  each  one  of 
the two colors 70% of the time and failed 30% of the time.

What  is  the  probability  that  the  cab  involved  in  the 
accident  was  Blue  rather  than  Green? 

Research  has  shown  that  people  often  have  trouble  answering 
problems  like  this.  In  this  portion  of  today's  experiment,  we 
are  presenting  you  with  a  mini-tutorial  to  see  if  instruction 
will  help  you  solve  such  problems.  Please  read  through  the 
tutorial  carefully.  We  have  allowed  time  in  the  experiment  for 
you  to  do  that. 

Tutorial

The  class  of  problems  here  addressed  are  problems  for  which 
two  kinds  of  information  are  given  and  a  probability  is 
requested.  One  kind  of  information  is  about  the  population  or 
populations  in  question.  The  other  kind  of  information  is 
specific  to  the  case  at  hand. 

In  the  problem  given  above,  the  population  is  the 
population  of  cabs  in  the  city.  The  population  information  is 
that 90% of the cabs are Green and 10% are Blue. The specific
information  concerns  the  specific  cab  that  was  involved  in  a 
hit  and  run  accident.  The  witness  said  that  the  specific  cab 
was  Blue.  But  we  also  know  about  this  testimony  that  the 
witness  is  not  perfectly  accurate.  The  witness  is  able  to 
correctly identify the color of the cab 70% of the time.

The way most people usually go wrong in solving these
problems is that they concentrate too much on the specific
information and tend to neglect the population information.
Maybe the specific information seems more immediately relevant
to them. Or perhaps they just don't know how to go about
combining the information to produce a single answer. Here is
a way of doing just that:

Step 1. Draw a table. Begin by drawing a "two-by-two"
table,  that  is,  a  diagram  with  two  rows  and  two  columns,  like 
this: 


Step 2. Label the table. We'll label the columns for the
population information. The population is cabs in the city,
which are either Blue or Green. The rows get the specific
information, that is, the witness testimony, which was Blue;
but for completeness, we'll also label the other row Green,
because the witness could have said Green. So now our table
looks like this:


                      Cabs in the City

                      Blue      Green

Witness said: Blue

              Green

Labeling  the  table  is  not  quite  as  simple  as  it  may  first 
appear. Notice that the sub-labels, "Blue" and "Green", are the
same for the rows and the columns. This should generally be
true in such problems. It would be a mistake to label the rows
according  to  whether  the  witness  was  accurate  or  inaccurate: 


Right 

Witness  said: 

Wrong 

The problem could be solved with such labeling, but not
using the method we are teaching you here. In general, the
sub-labels are the two possible states of the world. The main
labels (e.g., "Cabs in the City" and "Witness said:") indicate
the source of information. One source is always population
information (here, the relative number of cabs in the city);
the other source is always specific information (here, what the
witness said).

Notice  that  if  there  were  numbers  in  the  four  cells  of  the 
table,  we  could  calculate  row  totals  and  column  totals  and  a 
grand  total  for  the  whole  table.  The  places  for  these  totals 
are  shown  below  with  dashed  lines. 


                      Cabs in the City        Row
                      Blue      Green         Totals:

Witness said: Blue                            - - -

              Green                           - - -

Column Totals:        - - -     - - -         - - -  Grand Total

Step 3. Assign an arbitrary grand total. To get started,
we'll fill in the grand total. That should be the total number
of cabs in the city. But we don't know how many cabs there are
in the city. So we pick an arbitrary total of 1,000. We could
use 10 or 100 (or any other number), but using 1,000 will make
later calculations easier.


                      Cabs in the City

                      Blue      Green

Witness said: Blue

              Green

                                          1000


Step 4. Estimate the population totals. If there were
1,000 cabs in the city, how many of them would be Blue?
According to the story, 10% are Blue. That means 10 out of
every 100 or 100 out of every 1,000 are Blue. That number,
100, is the left column total. The rest are Green. So 1,000 -
100 = 900 is the right column total. We put these column totals
into the table:


                      Cabs in the City

                      Blue      Green

Witness said: Blue

              Green

                      100       900       1000

WARNING. The method we're teaching you for solving these
problems won't work if you start out estimating the wrong
totals. It's important in this step to correctly identify
which part of the problem gives population information and
which gives specific information. The population information
is general information that does not indicate any specific
case. The specific information fingers a particular case.


Step 5. Fill in the cells. Working with each total, divide
it among its two cells. First, for the 100 Blue cabs, how many
would the witness correctly see as Blue, and how many would the
witness incorrectly see as Green? The story states that the
witness is correct 70% of the time. So: 100 x .70 = 70 is the
number of Blue cabs the witness would correctly call Blue, and
the remaining, 100 - 70 = 30, are the number of Blue cabs the
witness would incorrectly call Green.

Now consider the 900 Green cabs. Again the witness'
accuracy is 70%: 900 x .70 = 630 is the number of Green cabs
the witness would have correctly called Green. This number,
630, goes in the Green-Green cell. The rest of the Green cabs,
900 - 630 = 270, is the number of Green cabs the witness would
have incorrectly called Blue.

Our  table  now  looks  like  this: 


                      Cabs in the City

                      Blue      Green

Witness said: Blue     70       270

              Green    30       630

                      100       900       1000


Comment. Notice that we now could, if we wished, find the
last two totals, the total number of times the witness would
have said "Blue", rightly or wrongly:

70 + 270 = 340

and the total number of times the witness would have said
"Green", rightly or wrongly:

30 + 630 = 660

These  totals  are  not  intuitively  obvious.  The  reason  is 
that  these  totals  are  the  total  number  of  times  the  witness 
says "Green" and "Blue". What the witness says depends not only
on  the  witness'  accuracy  but  also  on  the  relative  proportions 
of  Blue  and  Green  cabs  the  subject  might  have  seen.  You  have 
to  take  both  these  facts  into  consideration  to  calculate  the 
totals.  In  contrast,  the  population  totals  make  a  lot  of 
sense,  because  they  depend  on  only  one  kind  of  information,  not 
two  kinds.  The  total  number  of  Blue  cabs  in  the  city  is 
directly  calculated  as  a  percentage  of  the  total  number  of 
cabs,  regardless  of  what  the  witness  might  testify.  This 
distinction  is  important  because  it  shows  you  another  way  of 
telling,  in  any  problem,  which  is  the  population  information 
(that you start with in Step 4) and which is the specific
information.  The  population  information  is  information  that 
directly  translates  into  number  totals.  The  specific 
information  is  information  that  does  not  translate  into  number 
totals  because  those  number  totals  depend  not  only  on  the 
specific  information  but  also  on  the  population  information. 

In  summary,  here  are  two  criteria  (one  discussed  earlier) 
for  telling  which  is  which: 

The  population  information: 

(a)  is  general,  background  information  and 

(b)  can  be  translated  directly  into  number  totals. 

The  specific  information: 

(a)  specifies  or  identifies  one  case  and 

(b)  cannot  be  directly  translated  into  number  totals 
because  those  totals  also  depend  on  the  population 
information. 

Step 6. Cross out the false. The witness in the story in
fact testified that the cab was Blue. So the number of times
the witness might have said "Green" is irrelevant to the
problem. We cross out these false cells so we won't be tempted
to use them in the next step:

                      Cabs in the City

                      Blue      Green

Witness said: Blue     70       270

              Green     X        X

                      100       900       1000

Step 7. Find the needed probability. The two remaining
cells are what we need to answer the question. They show that
the witness would have said "Blue" correctly 70 times and would
have said "Blue" incorrectly 270 times. From these two numbers
we can get our probability.

If you're not used to thinking about probabilities, a nice
way to think about them is to imagine that you fill an urn with
70 balls labeled "cab is really Blue" and 270 balls labeled
"cab is really Green", for a total of 340 balls. Now sample
one ball at random from the urn. What is the probability that
the ball will be labeled "cab is really Blue"? The answer is
the number of "cab is really Blue" balls divided by the total
number of balls in the urn:

70 / (70 + 270) = 70 / 340 = .21 (well, it's really .2058... but we rounded it)

In other words, we divide the number in the TARGET cell by the
sum of the two numbers left in our table. The TARGET cell is
the one cell identified by both the specific information given
in the problem ("a witness identified the cab as Blue") and the
question asked at the end of the problem ("What is the
probability that the cab involved in the accident was Blue?").
So the target cell is the "cab is Blue/Witness said Blue" cell.

That's it. The answer, .21, is the probability that the
hit-and-run cab was a Blue cab.
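Steps 3 through 7 can also be followed in a short Python sketch. The function names and the dictionary layout are illustrative assumptions, not part of the tutorial; whole-number division is used on the assumption that the grand total is chosen so the counts come out even, as with 1,000 here.

```python
def build_table(grand_total, blue_percent, accuracy_percent):
    """Steps 3-5: fill the two-by-two table with whole-number counts."""
    blue = grand_total * blue_percent // 100               # Step 4: left column total
    green = grand_total - blue                              # Step 4: right column total
    said_blue_is_blue = blue * accuracy_percent // 100      # Step 5: Blue column
    said_green_is_blue = blue - said_blue_is_blue
    said_green_is_green = green * accuracy_percent // 100   # Step 5: Green column
    said_blue_is_green = green - said_green_is_green
    # Keys are (witness said, cab really is).
    return {
        ("Blue", "Blue"): said_blue_is_blue,
        ("Blue", "Green"): said_blue_is_green,
        ("Green", "Blue"): said_green_is_blue,
        ("Green", "Green"): said_green_is_green,
    }


def probability_blue_given_said_blue(table):
    """Steps 6-7: drop the 'witness said Green' row, then divide."""
    target = table[("Blue", "Blue")]
    return target / (target + table[("Blue", "Green")])


table = build_table(grand_total=1000, blue_percent=10, accuracy_percent=70)
print(table)
print(round(probability_blue_given_said_blue(table), 2))
```

Trying a different grand total, such as 100, gives the same probability, which is why the choice in Step 3 is arbitrary.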

Are you surprised by the answer? Most people think that
the correct answer should be .70, the same as the witness'
accuracy.  They  tend  to  forget  the  population  information,  that 
is,  they  fail  to  notice  that  because  there  are  so  many  more 
Green  cabs  than  Blue  cabs,  there  are  also  many  more 
opportunities  for  the  witness  to  be  wrong  when  saying  Blue. 




Comment. While it's not necessary to solve the problem, it
might help you to understand what's going on by thinking about
this: What if the witness had testified that the cab was
Green? Look back at the last table, the one with two
crossed-out cells. Those crossed-out cells show 30 really Blue cabs
and 630 really Green cabs. So the probability that the cab is
really Green, if the witness said it was Green, is:

630 / (630 + 30) = 630 / 660 = .95

This  probability  is  higher  than  either  the  proportion  of  Green 
cabs in the city (90%) or the accuracy of the witness (70%).
That's because in this case both pieces of information, the
population proportion and the witness' testimony, point in the
same direction, towards Green.

Intermediate probabilities like .21 are found only when the
two  pieces  of  information  point  in  opposite  directions;  the 
witness  said  Blue  but  most  cabs  are  Green. 
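The table method gives the same numbers as Bayes' rule. The derivation below is a sketch in notation that the tutorial itself does not use: B and G stand for "cab is really Blue" and "cab is really Green", and "says B" and "says G" stand for the witness' testimony.

```latex
P(B \mid \text{says } B)
  = \frac{P(\text{says } B \mid B)\,P(B)}
         {P(\text{says } B \mid B)\,P(B) + P(\text{says } B \mid G)\,P(G)}
  = \frac{.70 \times .10}{.70 \times .10 + .30 \times .90}
  = \frac{.07}{.34} \approx .21

P(G \mid \text{says } G)
  = \frac{P(\text{says } G \mid G)\,P(G)}
         {P(\text{says } G \mid G)\,P(G) + P(\text{says } G \mid B)\,P(B)}
  = \frac{.70 \times .90}{.70 \times .90 + .30 \times .10}
  = \frac{.63}{.66} \approx .95
```

Multiplying every term by the arbitrary grand total of 1,000 turns these products into the cell counts 70, 270, 630 and 30 from the table.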

That's  the  end  of  the  tutorial.  On  the  next  page  is  a 
problem  for  you  to  do.  Before  doing  the  problem: 

1.  Review  the  tutorial  to  make  sure  you  understand  it. 

2.  Ask  any  questions  you  have. 

When  you  are  ready,  proceed  to  the  problem  on  the  next 
page.  We  are  interested  in  how  effective  the  tutorial  is  in 
teaching you how to do such problems. So while you are doing
the problem, feel free to:

1.  Review  the  tutorial  again. 

2.  Use  a  hand  calculator. 

3. Ask questions.

Please  work  the  following  problem  using  the  method  just 
described. We've drawn you a table to work with.

A light bulb factory uses a scanning device which is
supposed to put a mark on each defective bulb it spots in the
assembly line. Eighty-five percent (85%) of the light bulbs
on the line are OK; the remaining 15% are defective.

The scanning device is known to be accurate in 80% of the
decisions, regardless of whether the bulb is actually OK or
actually defective. That is, when a bulb is good, the scanner
correctly identifies it as good 80% of the time. When a bulb
is defective, the scanner correctly marks it as defective 80%
of the time.

Suppose someone selects one of the light bulbs from the
line at random and gives it to the scanner. The scanner marks
this bulb as defective.

What is the probability that this bulb is really defective?


Step  1.  Draw  a  table.  Done. 

Step 2. Label the table.

Step 3. Assign an arbitrary grand total. Use 1,000.

Step 4. Estimate the population totals. First decide which set
of information is population information. Then divide the
1,000 into two parts, using information from the problem.

Step 5. Fill in the cells. Divide each of your estimated
totals among its two cells, according to the information in the
problem.

Step 6. Cross out the false. Cross out the two cells that are
contradicted by the information given in the problem.

Step 7. Find the needed probability. Write the relevant numbers
in the top and bottom of the fraction and convert the fraction
to a decimal answer.

    # in target cell
------------------------- = --------- = ________ answer.
 Sum of #s in both cells


A. Out of 1,000 light bulbs produced by the factory, how
many are defective? Multiply the percentage of
defective bulbs by 1,000. (First convert the percentage
value to a decimal value before multiplying.)

1,000 x ____ = ____ (A)
        Proportion of
        Defective Bulbs


B. Subtract your estimate in (A) from 1,000 to get the
number of bulbs out of 1,000 that are NOT defective.

1,000 - (A) ____ = ____ (B)


C. What percentage of the time is the scanner able to
correctly identify light bulbs that are actually
defective? (from the problem) ____ (C)

D. What percentage of the time is the scanner able to
correctly identify light bulbs that are actually not
defective? (from the problem) ____ (D)

E.  Look  over  the  following  table: 


                                 LIGHT BULBS ARE:

                                 Defective     NOT Defective

Scanner Says is Defective        Box # 1       Box # 4       ____ (L)

Scanner Says is NOT Defective    Box # 2       Box # 3

                                 ____ (a)      ____ (b)


F. Write the number of defective light bulbs from (A) on
the line labeled (a) in the table above, just below Box
# 2.

G. Write the number of non-defective light bulbs from (B)
on the line labeled (b) in the table above, just below
Box # 3.


H. Multiply the percentage value in (C) by your estimate
from (A). (First convert the percentage value to a
decimal value before multiplying.)

(A) ____ x (C) ____ = ____ (H)

Write your value for (H) in Box # 1.

I. Subtract your value in (H) from your value in (A).

(A) ____ - (H) ____ = ____ (I)

Write your value for (I) in Box # 2.

J. Multiply the percentage value in (D) by your estimate
from (B). (First convert the percentage value to a
decimal value before multiplying.)

(B) ____ x (D) ____ = ____ (J)

Write your value for (J) in Box # 3.

K. Subtract your value in (J) from your value in (B).

(B) ____ - (J) ____ = ____ (K)

Write your value for (K) in Box # 4.

L. Add the numbers in Boxes # 1 and # 4.

Box # 1 ____ + Box # 4 ____ = ____ (L)

Write your value for (L) on the line labeled (L), to the
right of the boxes.

M. To get the final answer, divide your value in Box # 1 by
your value for (L).

Box # 1 ____ ÷ (L) ____ = ____ (M)
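
The worksheet steps A to M can be traced in a few lines of Python. The following is a minimal sketch and not part of the original materials: the function name worksheet and its parameter names are hypothetical, and it assumes the percentages split the grand total into whole numbers, as they do for the light bulb problem.

```python
from fractions import Fraction


def worksheet(pct_defective, pct_hit, pct_correct_ok, grand_total=1000):
    """Trace worksheet steps A to M with whole-number counts.

    All names are illustrative assumptions. Percentages are passed as
    whole numbers (for example 15 for 15%).
    """
    a = grand_total * pct_defective // 100  # (A) defective bulbs
    b = grand_total - a                     # (B) bulbs that are NOT defective
    h = a * pct_hit // 100                  # (H) Box 1: defective, marked defective
    i = a - h                               # (I) Box 2: defective, not marked
    j = b * pct_correct_ok // 100           # (J) Box 3: not defective, not marked
    k = b - j                               # (K) Box 4: not defective, marked defective
    marked = h + k                          # (L) every bulb marked defective
    answer = Fraction(h, marked)            # (M) Box 1 divided by (L)
    return {"A": a, "B": b, "H": h, "I": i, "J": j, "K": k,
            "L": marked, "M": answer}


# Light bulb problem: 15% defective, scanner accurate 80% of the time.
result = worksheet(pct_defective=15, pct_hit=80, pct_correct_ok=80)
print(result["H"], result["L"], round(float(result["M"]), 2))
```

Under these assumptions, 120 of the 290 marked bulbs are really defective, which is roughly 0.41.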


A factory that manufactures parachute-brakes for aircraft
uses a special device for checking the parachutes.


Eighty-five percent of the parachutes are in order; the other
15% are defective.

The device is known to be accurate in 70% of the cases. That
is, 70% of the parachutes that are in order and 70% of defective
parachutes will be correctly identified as such.

A parachute was randomly selected from the line, and was
checked by the device. The device identified it as defective.

What are the chances that the parachute is really defective?


The Jaundice Problem

Jaundice  is  a  disease  that  may  occur  in  two  different  forms: 
viral  and  infectious. 

Twenty percent of Jaundice cases are infectious; the other 80%
are  viral.  The  symptoms  of  the  two  forms  are  identical,  but  the 
treatment  is  different.  Inadequate  treatment  may  cause  severe 
side  effects,  and  therefore  the  type  of  Jaundice  in  each  case 
must  be  identified  correctly. 

A  certain  blood  test  is  used  to  distinguish  between  the  two 
forms  of  Jaundice.  The  test  results  are  known  to  be  accurate  in 
80%  of  the  cases.  That  is,  both  infectious  and  viral  Jaundice 
will be correctly identified as such in 80% of the cases.

A  soldier  who  had  symptoms  of  Jaundice  was  administered  this 
blood  test.  The  test  results  indicated  infectious  Jaundice. 

What are the chances that the soldier really had infectious
Jaundice?


Generalization Problems


The Missile Problem

Intelligence sources revealed that a hostile army had purchased
sophisticated G-7 anti-aircraft missiles.

Israeli industry had developed a special device, capable of
receiving the signals broadcast from anti-aircraft missiles,
that enabled identification of missile type. The device is
known to be accurate in 80% of the cases; that is, a G-7
missile type and missiles of "other types" will be correctly
identified as such in 80% of the cases. Knowledge of the exact
type of missile improves the defence profile of an aircraft
flying in the bound area.

Research done by the Tactical Warfare Development Committee
shows that the chance of a G-7 type missile being launched is 10%.

Neutral risk -  An anti-aircraft missile has been sent to a
                certain area, and was identified by the
                device as being of type G-7.

General risk -  An anti-aircraft missile has been launched
                to a certain area where only Israeli
                aircraft fly, and was identified by the
                device as being of type G-7.

Personal risk - Suppose you are a pilot flying an Israeli
                aircraft. An anti-aircraft missile that had
                been launched to the area where you are
                flying was identified by the device as
                being of type G-7.


What are the chances that this missile is really a type G-7?


The Mask Problem


Due to failures in production, 20% of NBC (Nuclear, Biological &
Chemical) masks are defective; the other 80% are in order.

A defective mask can be identified by using a simple device.
This device is known to be accurate in 95% of the cases. That
is, a defective mask and a mask which is in order will be
correctly identified as such in 95% of the cases.

Neutral risk -  A mask was identified as defective.

General risk -  A mask, which is part of the personal
                equipment of a certain soldier, was
                identified as defective.

Personal risk - Due to warning of potential NBC attack,
                soldiers were given personal masks. Your
                mask was identified as defective.

The Seal Problem

Ten  percent  of  brake-cylinder  seals,  used  as  spare  parts  for 
armoured  vehicles  are  defective. 

A  special  device  is  used  to  check  the  seals.  The  device  is 
known  to  be  accurate  in  95%  of  the  cases.  That  is  to  say,  a 
good  seal  and  a  defective  seal  will  be  correctly  identified  as 
such  in  95%  of  the  cases. 

Neutral risk -  A seal was checked and was found to be in
                order.

General risk -  A seal was checked and was found to be in
                order. The seal was installed in an armoured
                vehicle which was sent on a dangerous
                mission.

Personal risk - A seal was checked and was found to be in
                order. The seal was installed in an armoured
                vehicle in which you are a crew member,
                and this vehicle was sent on a dangerous
                mission.

What  are  the  chances  that  the  seal  is  really  in  order? 




Consider the following problem:


8%  of  the  male  population  is  color  blind. 

A test of color blindness is known to be accurate in 90% of
the cases, that is, 90% of color blind men will be
correctly  identified  by  the  test  as  color  blind.  Of  those 
who  are  not  color  blind,  90%  will  be  correctly  identified 
as  having  normal  color  perception. 

A  certain  man  was  classified,  according  to  test  results,  as 
color  blind. 

What  are  the  chances  that  this  man  is  really  color  blind? 


Answer 


Research has shown that, when presented with such
problems, people give probability estimates that vary as follows:
90%; 8%; 7.2%; 41%.

The  reason  for  such  a  variety  of  answers  is  that  people 
understand  the  problem  and  its  solution,  in  different  ways,  not 
all  of  which  are  correct. 

In  the  following  pages  we  present  a  tutorial  to  help  you 
solve  such  problems  correctly.  Please  read  through  the 
tutorial  carefully,  and  solve  the  problems  that  will  be 
presented  to  you,  according  to  the  instructions. 


Tutorial 


The population discussed in the previous problem is the
male population. Suppose the circle we have drawn represents
this population.


We already know that 8% of this population is color blind.
The following drawing illustrates the division of the overall
population (the male population) into two sub-populations
(those who are color blind and those who have normal color
vision).


Area [shaded] represents the men who are color blind;

Area [blank] represents the men who have normal color vision.

The problem's storyline states that "A test of color
blindness is known to be accurate in 90% of the cases". If the
entire male population were tested, it would be possible to
divide the population into two groups of people. One group
would contain those men who would be identified as color blind
and the other would contain those men who would be identified
as having normal color vision. But, since the test is only
partially accurate, division of the population according to
test results would not represent the actual "state-of-the-
world". The division according to test results is illustrated
in the following drawing.


Area [ ] represents the men that would be identified
by the test as color blind (this area was originally red);

Area [ ] represents the men that would be identified
by the test as having normal color vision (this area
was originally yellow).

To illustrate the inaccuracy of the test we can superimpose
the two drawings.


Area [ ] represents the color blind men that would be identified
by the test as color blind;

Area [ ] represents color blind men that would be identified
by the test as having normal color vision;

Area [ ] represents men having normal color vision that
would be identified by the test as color blind;

Area [ ] represents men having normal color vision that
would be identified by the test as having normal color vision.


Note that in some of the cases the test results are in
error, that is, they do not reveal reality. In fact there are
two types of errors:

Miss: A color blind man, who is identified by the test as
having normal color vision.

False alarm: A man having normal color vision, who is
identified by the test as color blind.

An alternative way to present the total population and its
division into the various sub-populations is by using a "tree".
Look through the "tree" carefully.


[Figure: the "tree". Branch labels include "Identified as
Having Normal Color Vision" and "Identified as Color Blind".]

The problem's storyline also states: "A certain man was
classified, according to test results, as color blind". This
means that this man is a member of the sub-population entitled
"total identified as color blind" in the "tree".


We already know that this sub-population ("identified as
color blind") includes two kinds of men:

a. Color blind men who were identified by the test as color
blind;

b. Men having normal color vision who were identified by
the test as color blind (false alarm).

We are requested to indicate "What are the chances that
this man is really color blind?". In other words, out of all
the men who were identified by the test as color blind, what is
the actual proportion of color blind men? The appropriate
calculation is:

    Color Blind Men Identified as Such
 ----------------------------------------  =  Chances that the Man
 The Total No. of Men Identified               Is Really Color Blind
          as Color Blind


In order to calculate this, we have to determine the
appropriate values for each of the sub-populations represented in
the "tree" and the drawing. This can be done by using data
extracted from the problem itself, and according to the
following instructions.

1. Suppose the male population contains 1000 men. This
value will be written in the "tree" on page 9, in the
frame marked A.

2. Of these 1000 men, how many are really color blind?

1000 x .08 = 80
(The Male Population) x (The Proportion of Color Blind Men) = (The Total No. of Color Blind Men)

This value will be written in the "tree" on page 9, in
the frame marked B.




3. Of these 1000 men, how many have normal color vision?

1000 - 80 = 920
(The Total Population) - (The Total No. of Color Blind Men) = (The Total No. of Men Having Normal Color Vision)

The resulting value will be written in the "tree" on
page 9, in the frame marked C.

4. Of all color blind men, how many will be identified as
color blind?

80 x .90 = 72
(The Total No. of Color Blind Men) x (The Accuracy of Test Results) = (The Total No. of Color Blind Men Identified as Color Blind)

The resulting value will be written in the "tree" on
page 9, in the frame marked D.

5. Of all color blind men, how many will be identified as
having normal color vision?

80 - 72 = 8
(The Total No. of Color Blind Men) - (The Total No. of Color Blind Men Identified as Color Blind) = (Color Blind Men Identified as Having Normal Color Vision)

The resulting value will be written in the "tree" on
page 9, in the frame marked E.

6. Of all the men having normal color vision, how many will
be identified as having normal color vision?

920 x .90 = 828
(The Total No. of Men Having Normal Color Vision) x (The Accuracy of Test Results) = (Men Having Normal Color Vision Identified as Such)

The resulting value will be written in the "tree" on
page 9, in the frame marked F.


7. Of all men having normal color vision, how many will be
identified as color blind?

920 - 828 = 92
(The Total No. of Men Having Normal Color Vision) - (Men Having Normal Color Vision Identified as Such) = (Men Having Normal Color Vision Identified as Color Blind)

The resulting value will be written in the "tree" on
page 9, in the frame marked G.

8. What is the total of men identified as color blind?

72 + 92 = 164
(The Total No. of Color Blind Men Identified as Color Blind) + (Men Having Normal Color Vision Identified as Color Blind) = (The Total No. of Men Identified as Color Blind)

The resulting value will be written in the "tree" on
page 9, in the frame marked H.

9. What is the total of men identified as having normal
color vision?

828 + 8 = 836
(Men Having Normal Color Vision Identified as Such) + (Color Blind Men Identified as Having Normal Color Vision) = (The Total No. of Men Identified as Having Normal Color Vision)

The resulting value will be written in the "tree" on
page 9, in the frame marked I.

Now we have all the data required to calculate the target
probability. The calculation will be as follows:

72 : 164 = 0.44
(Color Blind Men Identified as Such) : (The Total No. of Men Identified as Color Blind) = (The Chances that the Man Is Really Color Blind)


Notice! The correct answer is approximately 0.44, that is, a 44%
chance that a man who was identified according to test results as
color blind is really color blind.
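
The nine steps of the tutorial can be checked with a short Python sketch. It is only an illustration under stated assumptions: the function name build_tree and its parameters are hypothetical, and the dictionary keys A to I simply mirror the frame labels the tutorial assigns in steps 1 to 9.

```python
def build_tree(population=1000, pct_color_blind=8, pct_accuracy=90):
    """Fill frames A to I of the "tree" with whole-number counts.

    Names are illustrative assumptions; percentages are whole numbers.
    """
    a = population                   # A: total male population
    b = a * pct_color_blind // 100   # B: color blind men
    c = a - b                        # C: men with normal color vision
    d = b * pct_accuracy // 100      # D: color blind, identified as color blind
    e = b - d                        # E: color blind, identified as normal (miss)
    f = c * pct_accuracy // 100      # F: normal, identified as normal
    g = c - f                        # G: normal, identified as color blind (false alarm)
    h = d + g                        # H: all men identified as color blind
    i = f + e                        # I: all men identified as normal
    return {"A": a, "B": b, "C": c, "D": d, "E": e,
            "F": f, "G": g, "H": h, "I": i}


tree = build_tree()
chance = tree["D"] / tree["H"]  # hits divided by everyone identified as color blind
print(tree)
print(round(chance, 2))
```

With the values from the problem (8% color blind, 90% accuracy), frame D holds 72 and frame H holds 164, so the ratio 72/164 evaluates to about 0.44.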

This  answer  may  seem  unreasonable  to  you.  In  this  case, 
you  are  advised  to  go  over  the  tutorial  again.  You  may  be  able 
to  understand  the  situation  discussed  in  the  problem  better, 
and  the  method  of  solution,  if  you  think  about  how  the  final 
results (i.e., the chances that a man who was identified
according to test results as color blind is really color
blind) would change if the percentage of color blind men in
the overall population was different.

This will be illustrated by drawings. Each one of the
following drawings represents a population in which the
percentage of color blind men is different. The degree
of accuracy of test results remains constant.



Notice! In each drawing, you can note the size of the area
marked [ ] relative to the area marked [ ]. This is the
proportion of color blind men who are, in fact, color blind
and who were identified as such. In addition you can note the
size of the error - misses + false alarms - of the test
results.

The situations discussed here contain two "rules" used to
determine the target probability. These "rules" are:

1. The distribution of the phenomenon in the population.

2. The degree of test accuracy (to what degree the test
results reveal reality).

Changing the value of either "rule" will change the size of
the various errors, and therefore the target probability.

A special case is that in which the percentage of color
blind men in the population is 50%. Here, the phenomenon of
color blindness is distributed randomly. That is, the
phenomenon is not distributed according to any "rule", and
therefore this information has no influence. In this case, the
only relevant information is the percentage of cases in which
the test results are accurate (in our story, 90%).
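
The effect of the two "rules" can be made concrete with a small Python sketch. It is an illustration only: the function name chance_really_positive and the chosen base rates are assumptions, and the test is taken to be equally accurate for both sub-populations, as in the problems above.

```python
from fractions import Fraction


def chance_really_positive(base_rate, accuracy):
    """Share of positively identified cases that are truly positive.

    Assumes the same accuracy for both sub-populations (an assumption
    matching the problems in this tutorial).
    """
    hits = base_rate * accuracy                    # truly positive, identified as such
    false_alarms = (1 - base_rate) * (1 - accuracy)  # truly negative, identified as positive
    return hits / (hits + false_alarms)


accuracy = Fraction(90, 100)  # test accuracy held constant
for pct in (1, 8, 25, 50, 75):  # hypothetical base rates, in percent
    base_rate = Fraction(pct, 100)
    print(pct, round(float(chance_really_positive(base_rate, accuracy)), 2))
```

When the base rate is 50%, hits and false alarms are drawn from equally large groups, so the result equals the test accuracy itself.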

This  is  the  end  of  the  tutorial.  In  the  following  pages 
you  will  be  presented  with  additional  problems.  You  are 
requested  to  solve  these  problems  according  to  the  tutorial. 
Before  continuing  you  are  advised: 

1.  To  read  the  tutorial  again,  and  make  sure  you  understand 
it. 

2. If you have any questions, you may consult the
experimenter.

When you are ready, go on to the next page. While
answering the next two problems, you may:

1.  Read  the  tutorial  again. 

2.  Use  a  calculator. 


3. Ask questions.

The data in Table 33 show that on the average subjects are
generally confident in the accuracy of their answers. The mean
confidence rating across all the data is 5.60. The mean
ratings are higher for the A groups than the UA groups. The
mean ratings are also higher for the NTL groups than the TL
groups. Of the various risk level groups, the mean confidence
rating for NR groups is the highest, followed by the PR groups
and then the GR groups. An analysis of variance performed on
these data showed a significant main effect for the aid condition
(F(1, 151) = 8.42, p < .01).

Reasonableness 

The Reasonableness judgements were found to be unaffected by
the  aid,  time  limit  and  risk  conditions.  The  number  of  "yes" 
responses  for  all  three  generalization  problems  for  each 
subject  was  computed.  An  analysis  of  variance  performed  on 
these  data  failed  to  reach  significance. 

Subjective Mental Load

The mean ratings and s.d. for difficulty, mental effort,
fatigue, frustration, subjective time stress and the subjective
mental load measure are shown in Tables 34 to 38.

Table 34: Mean and s.d. of Difficulty Rating (Upper No. =
Mean, Lower No. = s.d.)


TIME \ AID     AIDED     UNAIDED   TOTAL

UNLIMITED      2.98      2.75      2.85
               1.78      1.75      1.76

LIMITED        3.20      2.91      3.05
               1.55      1.89      1.76

TOTAL          3.09      2.83      2.95
               1.66      1.81      1.74

Table 38: Mean and s.d. of Subjective Mental Load (Upper
No. = Mean, Lower No. = s.d.)


TIME \ AID     AIDED     UNAIDED   TOTAL

UNLIMITED      2.81      2.72      2.77
               1.56      1.69      1.75

LIMITED        3.12      2.55      2.64
               1.57      1.62      1.77

TOTAL          3.01      2.64      2.62
               1.66      1.49      1.65

Tables 34 to 38 indicate that the mean ratings of
difficulty, mental effort, fatigue and the computed subjective
mental load measure were higher for the TL groups than the NTL
groups, and higher for the A groups than the UA groups. The
ratings were highest for the NR groups, followed by the GR
groups and the PR groups, in that order. The mean ratings for
frustration were similar to the above for time limit and risk,
but higher for the UA groups than the A groups.

The  mean  ratings  were  similar  for  difficulty,  mental  effort 
and  fatigue  ratings  for  aid  and  risk,  but  higher  for  the  TL 
groups  than  the  NTL  groups. 

An analysis of variance on these data indicated a significant
effect of time limit (F(1, 192) = 7.88, p < .01) and aid
(F(1, 192) = 17.76, p < .01) on subjective time stress only. Other
main effects failed to reach significance. The two-way
interaction of time limit x aid was found to be significant for
the mental effort (F(1, 192) = 2.56, p < .05), fatigue
(F(1, 192) = 13.31, p < .01) and time stress (F(1, 192) = 5.06, p < .05)
dimensions. The two-way interaction for time limit x aid was
found to be significant for the effort dimension
(F(1, 192) = 7.88, p < .01). The two-way interaction for time
limit x risk level was found to be significant for the fatigue
(F(1, 192) = 13.31, p < .05) and subjective time stress
(F(1, 192) = 4.41, p < .05) dimensions. The interactions are shown
in Figures 3 to 5.


Ratings as Function of Time

Figure 4: Mean Fatigue Ratings as Function of Time
Restriction and Risk Conditions.

Figure 5: Mean Time Stress Ratings as Function of Time
Restriction and Risk Conditions.

Simple Tutorial vs. TbMI

The results of Experiment III, obtained with the aid of the
tutorial, were compared with those obtained with the TbMI aid
under unlimited time conditions and neutral risk, when
performing the missile problem in Experiment IV. This group
was selected since it represents experimental conditions
similar to those of Experiment III. The Missile problem was
selected since it was the first generalization problem
presented to the subjects. Subjects' responses to the compared
generalization problems are shown in Table 39. The data in
Table 39 show that the proportion of "correct" responses was
higher for subjects in Experiment IV (72%) than for subjects in
Experiment III (55%).


Table 39:


                        AID TYPE

RESPONSE        TUTORIAL            TbMI

CORRECT         16    55.2%         13    72.2%

DIAGNOSTIC       3    10.3%          1     5.6%

BASE RATE        3    10.3%          0     0%

CONDITIONAL      1     3.4%          0     0%

OTHER            6    20.7%          4    22.2%

While a chi-square test showed no significant difference
between the two response distributions, Table 39 shows an
overall trend that indicates the relative advantage of the
TbMI method.
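
The percentages in Table 39 and a chi-square statistic for it can be recomputed from the raw counts. The sketch below is an assumption-laden illustration: the report does not say which chi-square variant was used, so a plain Pearson statistic without corrections is shown, and all names (TABLE_39, column_percentages, pearson_chi_square) are hypothetical. Several expected counts are small, so the statistic should be read with caution.

```python
# Response counts from Table 39 as (TUTORIAL, TbMI) pairs.
TABLE_39 = {
    "CORRECT": (16, 13),
    "DIAGNOSTIC": (3, 1),
    "BASE RATE": (3, 0),
    "CONDITIONAL": (1, 0),
    "OTHER": (6, 4),
}


def column_totals(table):
    """Number of subjects in each aid group."""
    tutorial = sum(row[0] for row in table.values())
    tbmi = sum(row[1] for row in table.values())
    return tutorial, tbmi


def column_percentages(table):
    """Each count as a percentage of its own column total."""
    totals = column_totals(table)
    return {
        label: tuple(round(100 * n / t, 1) for n, t in zip(row, totals))
        for label, row in table.items()
    }


def pearson_chi_square(table):
    """Uncorrected Pearson statistic: sum of (O - E)^2 / E over all cells."""
    totals = column_totals(table)
    grand = sum(totals)
    stat = 0.0
    for row in table.values():
        row_total = sum(row)
        for observed, col_total in zip(row, totals):
            expected = row_total * col_total / grand
            stat += (observed - expected) ** 2 / expected
    return stat


percentages = column_percentages(TABLE_39)
chi2 = pearson_chi_square(TABLE_39)
print(percentages["CORRECT"], round(chi2, 2))
```

With 4 degrees of freedom the .05 critical value is 9.49; under the assumptions above the statistic comes out below it, which agrees with the reported absence of a significant difference.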


The major finding of Experiment IV is that the use of
mental images, for presentation and organization of the verbal
explanation in the Training by Mental Image (TbMI) method,
contributed considerably to the effectiveness of this aid, under
all the experimental conditions. The results indicate that the
TbMI method succeeded in improving performance, creating
constructive change in the conceptualization of base-rate
problems, and in acquiring new cognitive skills with which to
examine these problems.

The way of solution specified in the tutorial used by
Lichtenstein & MacGregor (1985) and in Experiment III was
similar to the one specified in the TbMI. This enabled
comparison  between  the  two  tutorials  that  would  reveal  the 
contribution  of  the  mental  images  as  a  way  of  presentation. 

This  comparison  showed  that  the  TbMI  led  to  a  better 
generalization  than  the  original  tutorial. 

The risk manipulation did not influence performance. This
may indicate that the risk element in the generalization
problems did not affect the interpretation and judgements of
relevance of the different information types. An alternative
explanation is that, as hypothesized, the cognitive skills
acquired through the TbMI method were strong enough to
overcome the risk's influence. This is also supported by the
interaction effect of time limit x risk for the confidence
ratings. That is, if risk had no influence at all, the
confidence ratings would not be influenced; since they were,
it means that this influence was removed after training.

The time limit manipulation had a minor effect on subjects'
performance. This also indicates the effectiveness of the
aid and its generalized effect, especially in view of previous
findings that framing is not transferred to stress conditions
(Zakay, 1984), and in view of the validation of the time
restriction manipulation in Experiment IV. However, this
manipulation did affect confidence ratings for one
generalization problem. This may indicate that although the
training method was effective, more training is required in
order to make the new way of conceptualization more intuitive.

The degree of confidence in the accuracy of the answers
to two of the generalization problems was influenced by the
time restriction and aiding manipulations. Trained subjects
reported higher confidence than untrained subjects. The
subjects who had to solve the generalization problems under
time limit conditions reported lower confidence than those who
were not time restricted. This indicates that the training
method decreased the confusion and uncertainty with which the
untrained subjects had to deal. However, more training may be
required for time stress conditions.

The  reasonableness  judgements  were  not  affected  by  the  time 
restriction  and  the  training  manipulations.  This  is 
surprising,  in  light  of  their  influence  on  confidence  ratings. 
This  may  be  the  result  of  cognitive  dissonance. 

The  subjective  mental  load  measures  were  affected  by  aiding 
and  time  limit.  Higher  subjective  mental  load  was  reported 
when  subjects  performed  under  time  limit  condition,  and  when 
presented with the training method. A similar finding was
reported by Einhorn (1970), who found that using heuristics
requires less effort. It should be noted that the subjects
answered  the  subjective  mental  load  questionnaire  after 
completing  the  training  and  the  generalization  problems. 
Although  it  was  emphasized  that  the  questionnaire  related  only 
to  the  generalization  problems,  it  is  likely  that  the  training 
had  an  effect  on  these  measures. 

The  use  of  mental  images  for  presentation  and  organization 
of  verbal  material,  can  be  applied  in  developing  Computer  Aided 
Instruction  (CAI).  An  attempt  in  this  direction  has  already 
been made. A preliminary program for interactive learning, using
an IBM-XT computer, was developed. This program focuses only on
training for base-rate problem solution, according to the same
method used in the TbMI. Pretesting has revealed that this
concept is promising, but needs further development.




A. Regarding the algorithmic decomposition aid for
estimating unknown quantities, the results suggest that
in developing an aid or training method based on the
algorithmic approach, the unique characteristics of the
target population should be taken into account. The
aid and the training method must be adjusted
accordingly, in order to be compatible with the thinking
patterns and cognitive style of the target population.
Only after the aid and training method are adapted to
the population can its members compose individual
algorithms, matching their content and organization to
their own cognitive style and thinking patterns.

B.  The  TbMI  method  in  solving  base-rate  problems  is 
effective  and  led  to  systematic  change  of  the  way  in 
which  people  conceptualize  and  solve  base-rate  problems. 
This method should be further developed.

C.  The  use  of  mental  images  for  presentation  and 
organization  of  verbal  material,  can  be  applied  in 
developing  Computer  Aided  Instruction  (CAI). 

D.  In  view  of  the  success  of  the  TbMI  method,  in  solving 
base-rate  problems,  it  is  recommended  to  apply  this 
approach  as  a  training  method  for  general  estimation 
problems. 